<?xml version="1.0" encoding="UTF-8"?><rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>Debug Mind</title>
	<atom:link href="https://www.debugmind.com/feed/" rel="self" type="application/rss+xml" />
	<link>https://www.debugmind.com</link>
	<description>the personal blog of Akshay Agrawal</description>
	<lastBuildDate>Sat, 06 Mar 2021 18:55:11 +0000</lastBuildDate>
	<language>en-US</language>
	<sy:updatePeriod>
	hourly	</sy:updatePeriod>
	<sy:updateFrequency>
	1	</sy:updateFrequency>
	<generator>https://wordpress.org/?v=5.6.17</generator>
	<item>
		<title>Paths to the Future: A Year at Google Brain</title>
		<link>https://www.debugmind.com/2020/01/04/paths-to-the-future-a-year-at-google-brain/</link>
					<comments>https://www.debugmind.com/2020/01/04/paths-to-the-future-a-year-at-google-brain/#comments</comments>
		
		<dc:creator><![CDATA[Akshay Agrawal]]></dc:creator>
		<pubDate>Sat, 04 Jan 2020 23:24:56 +0000</pubDate>
				<category><![CDATA[Miscellaneous]]></category>
		<category><![CDATA[convex optimization]]></category>
		<category><![CDATA[Google Brain]]></category>
		<category><![CDATA[machine learning]]></category>
		<category><![CDATA[PhD]]></category>
		<category><![CDATA[programming languages]]></category>
		<guid isPermaLink="false">https://www.debugmind.com/?p=2688</guid>

					<description><![CDATA[I am currently a PhD student at Stanford, studying optimization and machine learning with Stephen Boyd, but from 2017 through 2018 I was a software engineer on the Google Brain team. I started three months after receiving a Master’s degree in computer science (also from Stanford), having just spent the summer working on a research [&#8230;]]]></description>
										<content:encoded><![CDATA[
<p class="has-drop-cap">I am currently a PhD student at Stanford, studying optimization and machine learning with <a href="https://web.stanford.edu/~boyd/index.html">Stephen Boyd</a>, but from 2017 through 2018 I was a software engineer on the <a href="https://research.google/teams/brain/">Google Brain</a> team. I started three months after receiving a Master’s degree in computer science (also from Stanford), having just spent the summer working on a research project—a <a href="http://www.cvxpy.org">domain-specific language for convex optimization</a>. At the time, a part of me wanted to continue working with my advisor, but another part of me was deeply curious about Google Brain. To see a famous artificial intelligence (AI) research lab from the inside would, at the very least, make for an interesting anthropological experience. So, I joined Google. My assignment was to work on <a href="http://www.tensorflow.org">TensorFlow</a>, an open-source software library for deep learning.<br></p>



<figure class="wp-block-image"><img loading="lazy" width="780" height="519" src="https://www.debugmind.com/wp-content/uploads/2020/01/google_small.jpg" alt="" class="wp-image-2910" srcset="https://www.debugmind.com/wp-content/uploads/2020/01/google_small.jpg 780w, https://www.debugmind.com/wp-content/uploads/2020/01/google_small-300x200.jpg 300w, https://www.debugmind.com/wp-content/uploads/2020/01/google_small-768x511.jpg 768w" sizes="(max-width: 780px) 100vw, 780px" /><figcaption>From 2017-2018, I worked as an engineer at Google Brain, Google&#8217;s AI research lab. Brain sits in Google&#8217;s headquarters in Mountain View, CA. Photo by Robbie Shade, licensed CC BY 2.0.</figcaption></figure>



<p>Brain was a magnet for Google&#8217;s celebrity employees. For the past few years, Google&#8217;s CEO Sundar Pichai (who believes AI is <a href="https://www.youtube.com/watch?v=_M_rSFBYEe8">&#8220;more profound than electricity or fire&#8221;</a>) has emphasized that Google is an “AI-first” company, with the company <a href="https://ai.googleblog.com/2018/05/introducing-google-ai.html">seeking to implement</a> machine learning in nearly everything it does. In a single afternoon, in the team&#8217;s kitchenette, I saw Pichai, co-founder Sergey Brin, and Turing Award winners David Patterson and John Hennessy.<br></p>



<p>I didn&#8217;t work with these celebrity employees, but I did get to work with some of the original TensorFlow developers. These developers gave me guidance when I sought it and habitually gave me more credit than I deserved. For example, my coworkers let me take the lead in writing an academic paper about <a href="https://arxiv.org/abs/1903.01855">TensorFlow 2</a>, even though my contributions to the technology were smaller than theirs. The unreasonable amount of trust placed in me, and credit given to me, made me work harder than I would have otherwise.<br></p>



<p class="has-drop-cap">The culture of Google Brain reminded me of what I’ve read about Xerox PARC. During the 1970s, researchers at PARC paved the way for the personal computing revolution by developing  graphical user interfaces and producing one of the earliest incarnations of a desktop computer.</p>



<p>PARC&#8217;s culture is documented in <em><a href="https://www.debugmind.com/files/alan-kay-context.pdf">The Power of the Context</a></em>, an essay written by PARC researcher Alan Kay. Kay describes PARC as a place where senior employees treated less experienced ones as “world-class researchers who just haven’t earned their PhDs yet” (similar to how my coworkers treated me). Kay goes on to say that researchers at PARC were self-motivated and capable “artists,” working independently or in small teams towards similar visions. This made for a productive environment that at times felt “out of control”:<br></p>



<blockquote class="wp-block-quote"><p>A great vision acts like a magnetic field from the future that aligns all the little iron particle artists to point to “North” without having to see it. They then make their own paths to the future. Xerox often was shocked at the PARC process and declared it out of control, but they didn’t understand that the context was so powerful and compelling and the good will so abundant, that the artists worked happily at their version of the vision. The results were an enormous collection of breakthroughs, some of which we are celebrating today.<br></p></blockquote>



<p>At Brain, as at PARC, researchers and engineers had an incredible amount of autonomy. They had bosses, to be sure, but they had a lot of leeway in choosing what to work on — in finding “their own paths to the future.” (I say &#8220;had&#8221;, not &#8220;have&#8221;, since I&#8217;m not sure whether Brain&#8217;s culture has changed since I left.)</p>



<p>I&#8217;ll give one example: a few years ago, many on the Google Brain team realized that machine learning tools were closer to programming languages than to libraries, and that redesigning their tools with this fact in mind would unlock greater productivity. Management didn&#8217;t command engineers to work on a particular solution to this problem. Instead, several small teams formed organically, each approaching the problem in its own way. TensorFlow 2.0, Swift for TensorFlow, JAX, Dex, Tangent, Autograph, and MLIR were all different angles on the same vision. Some were in direct tension with each other, but each was improved by the existence of the others—we shared notes often, and re-used each other&#8217;s solutions when possible. It&#8217;s possible that many of these tools will never become anything more than promising experiments, but it&#8217;s also possible that at least one will be a breakthrough. <br></p>



<figure class="wp-block-image"><img loading="lazy" width="1024" height="262" src="https://www.debugmind.com/wp-content/uploads/2020/01/logos-1-1024x262.png" alt="" class="wp-image-2866" srcset="https://www.debugmind.com/wp-content/uploads/2020/01/logos-1-1024x262.png 1024w, https://www.debugmind.com/wp-content/uploads/2020/01/logos-1-300x77.png 300w, https://www.debugmind.com/wp-content/uploads/2020/01/logos-1-768x196.png 768w, https://www.debugmind.com/wp-content/uploads/2020/01/logos-1.png 1687w" sizes="(max-width: 1024px) 100vw, 1024px" /><figcaption>TF 2.0, Swift for TensorFlow, and JAX, developed by separate sub-teams within Google Brain, are different paths to the same vision—an enjoyable, expressive, and performant programming language for machine learning. </figcaption></figure>



<p>I would guess that the PARC-like context that Brain operated in was instrumental in bringing about the creation of TensorFlow. In late 2015, Google open-sourced TensorFlow, making it freely available to the entire world. TensorFlow quickly became enormously popular. Instructors at Stanford and other universities used it in their curricula (my friend <a href="https://huyenchip.com/">Chip Huyen</a>, for example, created a Stanford course called <a href="http://web.stanford.edu/class/cs20si/">TensorFlow for Deep Learning Research</a>), researchers across the world used it to run experiments, and companies used it to train and deploy models in the real world. Today, TensorFlow is the fifth most popular project on GitHub out of the many millions of public software repositories hosted there, as measured by star count.<br></p>



<p>And yet, at least for TensorFlow, Google Brain’s hyper-creative, hyper-productive, and “out of control” culture was a double-edged sword. In the process of making their own paths to a shared future, TensorFlow engineers released many features sharing a similar purpose. Many of these features were subsequently deemphasized in favor of more promising ones. While this process might have selected for good features (like tf.data and eager execution), it <a href="https://nicodjimenez.github.io/2017/10/08/tensorflow.html">frustrated</a> and <a href="https://nostalgebraist.tumblr.com/post/189464877164/attention-conservation-notice-machine-learning">exhausted</a> our users, who struggled to keep up. </p>



<figure class="wp-block-image"><img loading="lazy" width="762" height="414" src="https://www.debugmind.com/wp-content/uploads/2020/01/tpu.jpg" alt="" class="wp-image-2841" srcset="https://www.debugmind.com/wp-content/uploads/2020/01/tpu.jpg 762w, https://www.debugmind.com/wp-content/uploads/2020/01/tpu-300x163.jpg 300w" sizes="(max-width: 762px) 100vw, 762px" /><figcaption>A Google TPU, a hardware accelerator for machine learning. While at Brain, I created a custom TensorFlow operation that made it easier to load balance computations across TPU cores.</figcaption></figure>



<p>Brain differed from PARC in at least one way: unlike PARC, which infamously failed to commercialize its research, Google productionized projects that were incubated in Brain. Examples include Google Translate, the BERT language model (which informs Google search), TPUs (hardware accelerators that Google rents to external clients, and uses internally for a variety of production projects), and Google Cloud AI (which sells AutoML as a service). In this sense Google Brain was a natural extension of Larry Page&#8217;s desire to work with people who want to do &#8220;crazy world-breaking things&#8221; while having &#8220;one foot in industry&#8221; (as Page stated in an interview with Walter Isaacson).</p>



<p class="has-drop-cap">Leaving Google Brain for a PhD was difficult. I had grown accustomed to the perks, and I appreciated the team&#8217;s proximity to research. Most of all I loved working alongside a large team on TensorFlow 2.0—I&#8217;m passionate about building better tools, for better minds. But I also love the creative expression that research provides.</p>



<p class="has-text-align-left">I&#8217;m often asked why I didn&#8217;t simply involve myself in research at Brain, instead of enrolling in a PhD program. Here&#8217;s why: the zeitgeist had little room for topics other than deep learning and reinforcement learning. Indeed, in 2018, Google <a href="https://web.archive.org/web/20200306181037/https://ai.googleblog.com/2018/05/introducing-google-ai.html">rebranded</a> &#8220;Google Research&#8221; to &#8220;Google AI,&#8221; redirecting <em>research.google.com</em> to <em>ai.google.com</em>. (The rebranding understandably <a href="https://twitter.com/random_walker/status/993850870067548160">raised some eyebrows</a>. It appears that change was quietly rolled back sometime recently, and the <a href="http://research.google.com">Google Research brand</a> has been resurrected.) While I’m interested in machine learning, I’m not convinced that today’s AI is anywhere near as profound as electricity or fire, and I wanted to be trained in a more intellectually diverse environment.</p>



<p>In fact, most of my mentors at Brain encouraged me to enroll in the PhD program. Only one researcher strongly discouraged me from pursuing a PhD, comparing the experience to “psychological torture.” I was so shocked by his dark warning that I didn’t ask any follow-up questions; he didn’t elaborate, and our meeting ended shortly afterwards.</p>



<p class="has-drop-cap">These days, in addition to machine learning, I&#8217;m interested in in convex optimization, a branch of computational mathematics concerned with making optimal choices. Convex optimization has many real-world applications—SpaceX uses it to land rockets, self-driving cars use it to track trajectories, financial companies use it to design investment portfolios, and, yes, machine learning engineers use it to train models. While well-studied, as a technology, convex optimization is still young and niche. I suspect that convex optimization has the potential to become a powerful, widely-used technology. I&#8217;m interested in doing the work—a bit of math and a bit of computer science—to realize its potential. My advisor at Stanford, <a href="http://web.stanford.edu/~boyd/">Stephen Boyd</a>, is perhaps the world&#8217;s leading expert on applications of convex optimization, and I simply could not pass up an opportunity to do useful research under his guidance. </p>



<figure class="wp-block-image"><img loading="lazy" width="780" height="520" src="https://www.debugmind.com/wp-content/uploads/2020/01/spacex-1.jpg" alt="" class="wp-image-2854" srcset="https://www.debugmind.com/wp-content/uploads/2020/01/spacex-1.jpg 780w, https://www.debugmind.com/wp-content/uploads/2020/01/spacex-1-300x200.jpg 300w, https://www.debugmind.com/wp-content/uploads/2020/01/spacex-1-768x512.jpg 768w" sizes="(max-width: 780px) 100vw, 780px" /><figcaption>SpaceX solves convex optimization problems onboard to land its rockets, using CVXGEN, a code generator for quadratic programming developed at Stephen Boyd&#8217;s Stanford lab. Photo by SpaceX, licensed CC BY-NC 2.0.<br></figcaption></figure>



<p>It&#8217;s been just over a year since I left Google and started my PhD. Since then, I&#8217;ve collaborated with my lab to publish <a href="https://www.akshayagrawal.com/#publications">several papers</a>, including one that makes it possible to automatically learn the structure of convex optimization problems, <a href="http://web.stanford.edu/~boyd/papers/pdf/diff_cvxpy.pdf">bridging the gap between convex optimization and deep learning</a>. I&#8217;m now one of three core developers of <a href="https://github.com/cvxgrp/cvxpy">CVXPY</a>, an open-source library for convex optimization, and I have total creative control over my research and engineering projects. </p>
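


<p>To give a flavor of what CVXPY looks like, here&#8217;s a minimal sketch (an illustrative toy, not code from any of the papers above) that solves a least squares problem with a nonnegativity constraint:</p>



<pre class="wp-block-code"><code>import cvxpy as cp
import numpy as np

# A tiny convex problem: minimize ||Ax - b||^2 subject to x >= 0.
A, b = np.random.randn(10, 5), np.random.randn(10)
x = cp.Variable(5)
problem = cp.Problem(cp.Minimize(cp.sum_squares(A @ x - b)), [x >= 0])
problem.solve()
print(x.value)  # an optimal, entrywise-nonnegative x</code></pre>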



<p>There are many things about Google Brain that I miss, my coworkers most of all. But now, at Stanford, I get to collaborate with and learn from an intellectually diverse group of extremely smart and passionate individuals, including pure mathematicians, electrical and chemical engineers, physicists, biologists, and computer scientists. </p>



<p>I&#8217;m not sure what I&#8217;ll do once I graduate, but for now, I&#8217;m having a lot of fun—and learning a ton—doing a bit of math, writing papers, shipping real software, and exploring several lines of research in parallel. If I&#8217;m very lucky, one of them might even be a breakthrough.<br></p>
]]></content:encoded>
					
					<wfw:commentRss>https://www.debugmind.com/2020/01/04/paths-to-the-future-a-year-at-google-brain/feed/</wfw:commentRss>
			<slash:comments>17</slash:comments>
		
		
			</item>
		<item>
		<title>A Primer on TensorFlow 2.0</title>
		<link>https://www.debugmind.com/2019/04/07/a-primer-on-tensorflow-2-0/</link>
					<comments>https://www.debugmind.com/2019/04/07/a-primer-on-tensorflow-2-0/#comments</comments>
		
		<dc:creator><![CDATA[Akshay Agrawal]]></dc:creator>
		<pubDate>Mon, 08 Apr 2019 02:20:53 +0000</pubDate>
				<category><![CDATA[Miscellaneous]]></category>
		<category><![CDATA[machine learning]]></category>
		<category><![CDATA[programming languages]]></category>
		<category><![CDATA[tutorial]]></category>
		<guid isPermaLink="false">https://www.debugmind.com/?p=2142</guid>

					<description><![CDATA[TensorFlow (TF) 2.0 is a significant, backwards-incompatible update to TF's execution model and API. In TF 2.0, all operations execute imperatively by default, and tf.keras is the only high-level API for neural networks. Graphs are still available via a just-in-time tracer.]]></description>
										<content:encoded><![CDATA[
<div class="wp-block-image"><figure class="aligncenter is-resized"><img loading="lazy" src="https://www.debugmind.com/wp-content/uploads/2019/04/tf.png" alt="" class="wp-image-2145" width="197" height="235" srcset="https://www.debugmind.com/wp-content/uploads/2019/04/tf.png 563w, https://www.debugmind.com/wp-content/uploads/2019/04/tf-251x300.png 251w" sizes="(max-width: 197px) 100vw, 197px" /></figure></div>



<p><em>This post is also available as a <a href="https://colab.research.google.com/drive/1gEkd_D-b8Y0Quxdz-OoJL_sFWGhP5lSR">Python notebook</a>.</em></p>



<p>From September 2017 to October 2018, I worked on TensorFlow 2.0 alongside many engineers. In this post, I&#8217;ll explain what TensorFlow 2.0 is and how it differs from TensorFlow 1.x. Towards the end, I&#8217;ll briefly compare TensorFlow 2.0 to PyTorch 1.0. This post represents my own views; it does not represent the views of Google, my former employer.</p>



<p>TensorFlow (TF) 2.0 is a significant, backwards-incompatible update to TF&#8217;s execution model and API. </p>



<p><strong>Execution model.</strong> In TF 2.0, all operations execute imperatively by default. Graphs and the graph runtime are both abstracted away by a just-in-time tracer that translates Python functions executing TF operations into executable <em>graph functions.</em> This means in TF 2.0, there is no <code>Session</code>, and no global graph state. The tracer is exposed as a Python decorator, <code>tf.function</code>. This decorator is for advanced users. Using it is completely optional.</p>



<p><strong>API.</strong> TF 2.0 makes <code>tf.keras</code> <em>the</em> high-level API for constructing and training neural networks. But you don&#8217;t have to use Keras if you don&#8217;t want to. You can instead use lower-level operations and automatic differentiation directly.</p>



<p>To follow along with the code examples in this post, install the <a href="https://tensorflow.org/alpha">TF 2.0 alpha</a>. </p>



<pre class="wp-block-code"><code>pip install tensorflow==2.0.0-alpha0</code></pre>



<pre class="wp-block-code"><code>import tensorflow as tf
tf.__version__</code></pre>



<pre class="wp-block-code wp-block-output"><code>'2.0.0-alpha0'</code></pre>



<p><em>Contents</em></p>



<ol><li><a href="#why-tf-2-0">Why TF 2.0?</a></li><li><a href="#imperative-execution">Imperative execution</a></li><li><a href="#state">State</a></li><li><a href="#automatic-differentiation">Automatic differentiation</a></li><li><a href="#keras">Keras</a></li><li><a href="#graph-functions">Graph functions</a></li><li><a href="#comparison-to-other-python-libraries">Comparison to other Python libraries</a></li><li><a href="#domain-specific-languages-for-machine-learning">Domain-specific languages for machine learning</a></li></ol>



<span id="more-2142"></span>



<h2 id="why-tf-2-0">I. Why TF 2.0?</h2>



<p>TF 2.0 largely exists to make TF easier to use, for newcomers and researchers alike.</p>



<h4>TF 1.x requires metaprogramming</h4>



<p>TF 1.x was designed to train extremely large, static neural networks. Representing a model as a dataflow graph and separating its specification from its execution simplifies training at scale, which explains why TF 1.x uses Python as a <a href="https://en.wikipedia.org/wiki/Declarative_programming">declarative </a><a href="https://en.wikipedia.org/wiki/Metaprogramming">metaprogramming</a> tool for graphs.</p>



<p>But most people don&#8217;t need to train Google-scale models, and most people find metaprogramming difficult. Constructing a TF 1.x graph is like writing assembly code, and this abstraction is so low-level that it is hard to produce anything but the simplest <a href="https://www.facebook.com/yann.lecun/posts/10155003011462143">differentiable programs</a> using it. Programs that have data-dependent <a href="https://en.wikipedia.org/wiki/Control_flow">control flow</a> are particularly hard to express as graphs.</p>



<h4>Metaprogramming is (often) unnecessary</h4>



<p>It is possible to implement automatic differentiation by tracing computations while they are executed, without static graphs; <a href="https://chainer.org/">Chainer</a>, <a href="https://openreview.net/pdf?id=BJJsrmfCZ">PyTorch</a>, and <a href="https://github.com/HIPS/autograd">autograd</a> do exactly that. These libraries are substantially easier to use than TF 1.x, since <a href="https://en.wikipedia.org/wiki/Imperative_programming">imperative programming</a> is so much more natural than declarative programming. Moreover, when training models with large operations on a single machine, these graph-free libraries are competitive with TF 1.x performance. For these reasons, TF 2.0 privileges imperative execution.</p>



<p>Graphs are still useful in certain situations: for distribution, serialization, code generation, deployment, and (sometimes) performance. That&#8217;s why TF 2.0 provides the just-in-time tracer <code>tf.function</code>, which transparently converts Python functions into functions backed by graphs. This tracer also rewrites tensor-dependent Python control flow to TF control flow, and it automatically adds control dependencies to order reads and writes to TF state. This means that constructing graphs via <code>tf.function</code> is much easier than constructing TF 1.x graphs manually.</p>



<h4>Multi-stage programming</h4>



<p>The ability to create polymorphic graph functions via <code>tf.function</code> at runtime makes TF 2.0 similar to a <a href="http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.438.6924&amp;rep=rep1&amp;type=pdf"><em>multi-stage</em> programming</a> language. </p>



<p>For TF 2.0, I recommend the following multi-stage workflow. Start by implementing your program in imperative mode. Once you&#8217;re satisfied that your program is correct, measure its performance. If the performance is unsatisfactory, analyze your program using <code>cProfile</code> or a comparable tool to find bottlenecks consisting of TF operations. Next, refactor the bottlenecks into Python functions, and stage these functions in graphs with <code>tf.function</code>.</p>



<div class="wp-block-image"><figure class="aligncenter is-resized"><img loading="lazy" src="https://www.debugmind.com/wp-content/uploads/2019/04/image.png" alt="" class="wp-image-2220" width="199" height="208" srcset="https://www.debugmind.com/wp-content/uploads/2019/04/image.png 766w, https://www.debugmind.com/wp-content/uploads/2019/04/image-287x300.png 287w" sizes="(max-width: 199px) 100vw, 199px" /><figcaption>A multi-stage workflow for TF 2.0.</figcaption></figure></div>



<p>If you mostly use TF 2.0 to train large deep models, you probably won&#8217;t need to analyze or stage your programs. If on the other hand you write programs that execute lots of small operations, like MCMC samplers or reinforcement learning algorithms, you&#8217;ll likely find this workflow useful. In such cases, the Python overhead incurred by executing operations eagerly actually matters.</p>



<h2 id="imperative-execution">II. Imperative execution</h2>



<p>In TF 2.0, all operations are executed imperatively, or &#8220;eagerly&#8221;, by default. If you&#8217;ve used NumPy or PyTorch, TF 2.0 will feel familiar. For example, the following line of code will immediately construct two tensors backed by numerical data and then execute the <code>add</code> operation.</p>



<pre class="wp-block-code"><code>tf.constant([1., 2.]) + tf.constant([3., 4.])</code></pre>



<pre class="wp-block-code wp-block-output"><code>&lt;tf.Tensor: id=1440, shape=(2,), dtype=int32, numpy=array([4, 6], dtype=float32)></code></pre>



<p>Contrast the above code snippet to its verbose, awkward TF 1.x equivalent:</p>



<pre class="wp-block-code"><code># TF 1.X code
x = tf.placeholder(tf.float32, shape=[2])
y = tf.placeholder(tf.float32, shape=[2])
value = x + y

with tf.Session() as sess:
  print(sess.run(value, feed_dict={x: [1., 2.], y: [3., 4.]}))</code></pre>



<p>In TF 2.0, there are no placeholders, no sessions, and no feed dicts. Because operations are executed immediately, you can use (and differentiate through) <code>if</code> statements and <code>for</code> loops (no more <code>tf.cond</code>  or <code>tf.while_loop</code>). You can also use whatever Python data structures you like, and debug your programs with print statements and <code>pdb</code>.</p>
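


<p>As a small illustration (a toy of my own, not from the TF docs), the following function mixes tensor-dependent <code>if</code> statements and a <code>while</code> loop, and it executes step by step, just like ordinary Python:</p>



<pre class="wp-block-code"><code>def collatz_steps(x):
  # Ordinary Python control flow on tensor values, executed eagerly.
  steps = 0
  while x != 1:
    if x % 2 == 0:
      x = x // 2
    else:
      x = 3 * x + 1
    steps += 1
  return steps

collatz_steps(tf.constant(6))  # returns 8</code></pre>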



<p>If TF detects that a GPU is available, it will automatically run operations on the GPU when possible. The target device can also be controlled explicitly. </p>



<pre class="wp-block-code"><code>if tf.test.is_gpu_available():
  with tf.device('gpu:0'):
    tf.constant([1., 2.]) + tf.constant([3., 4.])</code></pre>



<h2 id="state">III. State</h2>



<p>Using <code>tf.Variable</code> objects in TF 1.x required wrangling global collections of graph state, with confusing APIs like <code>tf.get_variable</code>, <code>tf.variable_scope</code>, and <code>tf.initializers.global_variables</code>. TF 2.0 does away with global collections and their associated APIs. If you need a <code>tf.Variable</code> in TF 2.0, then you just construct and initialize it directly:</p>



<pre class="wp-block-code"><code>tf.Variable(tf.random.normal([3, 5]))</code></pre>



<pre class="wp-block-code wp-block-output"><code>&lt;tf.Variable 'Variable:0' shape=(3, 5) dtype=float32, numpy=
array([[ 0.13141578, -0.18558209,  1.2412338 , -0.5886968 , -0.9191646 ],
       [ 1.186105  , -0.45135704,  0.57979995,  0.12573312, -0.7697861 ],
       [ 0.28296474,  1.2735683 , -0.08385598,  0.59388596, -0.2402552 ]],
      dtype=float32)></code></pre>



<h2 id="automatic-differentiation">IV. Automatic differentiation</h2>



<p>TF 2.0 implements reverse-mode <a href="https://books.google.com/books/about/Evaluating_Derivatives.html?id=xoiiLaRxcbEC">automatic differentiation</a> (also known as backpropagation), using a trace-based mechanism. This trace, or <em>tape</em>, is exposed as a context manager, <code>tf.GradientTape</code>. The <code>watch</code> method designates a Tensor as something that we&#8217;ll need to differentiate with respect to later. Notice that by tracing the computation of <code>dy_dx</code> under the first tape, we&#8217;re able to compute <code>d2y_dx2</code>.</p>



<pre class="wp-block-code"><code>x = tf.constant(3.0)
with tf.GradientTape() as t1:
  with tf.GradientTape() as t2:
    t1.watch(x)
    t2.watch(x)
    y = x * x
  dy_dx = t2.gradient(y, x)
d2y_dx2 = t1.gradient(dy_dx, x)</code></pre>



<pre class="wp-block-code"><code>dy_dx</code></pre>



<pre class="wp-block-code wp-block-output"><code>&lt;tf.Tensor: id=62, shape=(), dtype=float32, numpy=6.0></code></pre>



<pre class="wp-block-code"><code>d2y_dx2</code></pre>



<pre class="wp-block-code wp-block-output"><code>&lt;tf.Tensor: id=68, shape=(), dtype=float32, numpy=2.0></code></pre>



<p><code>tf.Variable</code> objects are watched automatically by tapes. </p>



<pre class="wp-block-code"><code>x = tf.Variable(3.0)
with tf.GradientTape() as t1:
  with tf.GradientTape() as t2:
    y = x * x
  dy_dx = t2.gradient(y, x)
d2y_dx2 = t1.gradient(dy_dx, x)</code></pre>



<h2 id="keras">V. Keras</h2>



<p>TF 1.x is notorious for having many mutually incompatible high-level APIs for neural networks. TF 2.0 has just one high-level API: <code>tf.keras</code>, which essentially implements the <a href="https://keras.io/">Keras API</a> but is customized for TF. Several standard layers for neural networks are available in the <code>tf.keras.layers</code>  namespace. </p>



<p>Keras layers can be composed via <code>tf.keras.Sequential()</code>  to obtain an object representing their composition. For example, the below code trains a toy CNN on MNIST. (Of course, MNIST can be solved by much simpler methods, like <a href="http://vmls-book.stanford.edu/vmls.pdf">least squares</a>.)</p>



<pre class="wp-block-code"><code>mnist = tf.keras.datasets.mnist
(x_train, y_train), (x_test, y_test) = mnist.load_data()
input_shape=[28, 28, 1]
data_format="channels_last"

max_pool = tf.keras.layers.MaxPooling2D(
      (2, 2), (2, 2), padding='same', data_format=data_format)

model = tf.keras.Sequential([
  tf.keras.layers.Reshape(target_shape=input_shape,
    input_shape=[28, 28]),
  tf.keras.layers.Conv2D(32,5,
    padding='same', data_format=data_format,
    activation=tf.nn.relu),
  max_pool,
  tf.keras.layers.Conv2D(64, 5,
    padding='same', data_format=data_format,
    activation=tf.nn.relu),
  max_pool,
  tf.keras.layers.Flatten(),
  tf.keras.layers.Dense(1024, activation=tf.nn.relu),
  tf.keras.layers.Dropout(0.3),
  tf.keras.layers.Dense(10, activation=tf.nn.softmax)
])</code></pre>



<pre class="wp-block-code"><code>model.compile(optimizer=tf.optimizers.Adam(),
              loss=tf.losses.sparse_categorical_crossentropy,
              metrics=['accuracy'])</code></pre>



<pre class="wp-block-code"><code>model.fit(x_train, y_train, epochs=1)</code></pre>



<pre class="wp-block-code wp-block-output"><code>60000/60000 [==============================] - 238s 4ms/sample - loss: 0.3417 - accuracy: 0.9495</code></pre>



<p>Alternatively, the same model could have been written as a subclass of <code>tf.keras.Model</code>.</p>



<pre class="wp-block-code"><code>class ConvNet(tf.keras.Model):
  def __init__(self, input_shape, data_format):
    super(ConvNet, self).__init__()
    self.reshape = tf.keras.layers.Reshape(
      target_shape=input_shape, input_shape=[28, 28])
    self.conv1 = tf.keras.layers.Conv2D(32,5,
      padding='same', data_format=data_format,
      activation=tf.nn.relu)
    self.pool = tf.keras.layers.MaxPooling2D(
      (2, 2), (2, 2), padding='same', data_format=data_format)
    self.conv2 = tf.keras.layers.Conv2D(64, 5,
      padding='same', data_format=data_format,
      activation=tf.nn.relu)
    self.flt = tf.keras.layers.Flatten()
    self.d1 = tf.keras.layers.Dense(1024, activation=tf.nn.relu)
    self.dropout = tf.keras.layers.Dropout(0.3)
    self.d2 = tf.keras.layers.Dense(10, activation=tf.nn.softmax)

  def call(self, x):
    x = self.reshape(x)
    x = self.conv1(x)
    x = self.pool(x)
    x = self.conv2(x)
    x = self.pool(x)
    x = self.flt(x)
    x = self.d1(x)
    x = self.dropout(x)
    return self.d2(x)</code></pre>



<p>If you don&#8217;t want to use <code>tf.keras</code>, you can use low-level APIs like <code>tf.reshape</code>, <code>tf.nn.conv2d</code>, <code>tf.nn.max_pool</code>, <code>tf.nn.dropout</code>, and <code>tf.matmul</code> directly.</p>
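


<p>For example, here&#8217;s a minimal sketch (mine, not from the TF docs) of a linear model trained with nothing but low-level operations, a couple of <code>tf.Variable</code> objects, and <code>tf.GradientTape</code>:</p>



<pre class="wp-block-code"><code>w = tf.Variable(tf.random.normal([3, 1]))
b = tf.Variable(tf.zeros([1]))

x = tf.random.normal([32, 3])  # toy inputs
y = tf.random.normal([32, 1])  # toy targets

for _ in range(100):
  with tf.GradientTape() as tape:
    loss = tf.reduce_mean(tf.square(tf.matmul(x, w) + b - y))
  dw, db = tape.gradient(loss, [w, b])
  w.assign_sub(0.05 * dw)  # plain gradient descent
  b.assign_sub(0.05 * db)</code></pre>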



<h2 id="graph-functions">VI. Graph functions</h2>



<p>For advanced users who need graphs, TF 2.0 provides <code>tf.function</code>, a just-in-time tracer that converts Python functions that execute TensorFlow operations into <em>graph functions</em>. A graph function is a TF graph with named inputs and outputs. Graph functions are executed by a C++ runtime that automatically partitions graphs across devices, parallelizing and optimizing them before execution. </p>



<p>Calling a graph function is syntactically equivalent to calling a Python function. Here&#8217;s a very simple example.</p>



<pre class="wp-block-code"><code>@tf.function
def add(tensor):
  return tensor + tensor + tensor</code></pre>



<pre class="wp-block-code"><code># Executes as a dataflow graph
add(tf.ones([2, 2])) </code></pre>



<pre class="wp-block-code wp-block-output"><code>&lt;tf.Tensor: id=1487, shape=(2, 2), dtype=float32, numpy=
array([[3., 3.],
       [3., 3.]], dtype=float32)></code></pre>



<p>The <code>add</code> function is also polymorphic in the data types and shapes of its Tensor arguments (and the run-time values of the non-Tensor arguments), even though TF graphs are not. </p>



<pre class="wp-block-code"><code>add(tf.ones([2, 2], dtype=tf.uint8)) </code></pre>



<pre class="wp-block-code wp-block-output"><code>&lt;tf.Tensor: id=1499, shape=(2, 2), dtype=uint8, numpy=
array([[3, 3],
       [3, 3]], dtype=uint8)></code></pre>



<p>Every time a graph function is called, its &#8220;input signature&#8221; is analyzed. If the input signature doesn&#8217;t match one seen before, <code>tf.function</code> re-traces the Python function and constructs another concrete graph function. (In programming languages terms, this is like <a href="https://en.wikipedia.org/wiki/Multiple_dispatch">multiple dispatch</a> or <a href="https://infoscience.epfl.ch/record/150347/files/gpce63-rompf.pdf">lightweight modular staging</a>.) This means that for one Python function, many concrete graph functions might be constructed. It also means that every call that triggers a trace will be slow, but subsequent calls with the same input signature will be much faster. </p>
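


<p>You can observe the retracing for yourself: Python side effects like <code>print</code> run only while <code>tf.function</code> is tracing, so a print statement reveals exactly when a new concrete graph function is constructed. (A quick sketch of mine, not from the TF docs.)</p>



<pre class="wp-block-code"><code>@tf.function
def square(x):
  print('tracing for', x.dtype, x.shape)  # runs only during a trace
  return x * x

square(tf.constant(2.0))         # traces: float32 scalar
square(tf.constant(3.0))         # same signature, no new trace
square(tf.constant([1.0, 2.0]))  # new shape, traces again</code></pre>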



<h4>Lexical closure, state, and control dependencies</h4>



<p>Graph functions support lexically closing over <code>tf.Tensor</code> and <code>tf.Variable</code> objects. You can mutate <code>tf.Variable</code> objects inside a graph function, and <code>tf.function</code> will automatically add the control dependencies needed to ensure that your reads and writes happen in program-order.</p>



<pre class="wp-block-code"><code>a = tf.Variable(1.0)
b = tf.Variable(1.0)

@tf.function
def f(x, y):
  a.assign(y * b)
  b.assign_add(x * a)
  return a + b

f(tf.constant(1.0), tf.constant(2.0))</code></pre>



<pre class="wp-block-code wp-block-output"><code>&lt;tf.Tensor: id=1569, shape=(), dtype=float32, numpy=5.0></code></pre>



<pre class="wp-block-code"><code>a</code></pre>



<pre class="wp-block-code wp-block-output"><code>&lt;tf.Variable 'Variable:0' shape=() dtype=float32, numpy=2.0></code></pre>



<pre class="wp-block-code"><code>b</code></pre>



<pre class="wp-block-code wp-block-output"><code>&lt;tf.Variable 'Variable:0' shape=() dtype=float32, numpy=3.0></code></pre>



<h4>Python control flow</h4>



<p><code>tf.function</code> automatically rewrites Python control flow that depends on <code>tf.Tensor</code> data into graph control flow, using <a href="https://www.tensorflow.org/guide/autograph">autograph</a>. This means that you no longer need to use constructs like <code>tf.cond</code> and <code>tf.while_loop</code>. For example, if we were to translate the following function into a graph function via <code>tf.function</code>, autograph would convert the <code>for</code> loop into a <code>tf.while_loop</code>, because it depends on <code>tf.range(100)</code>, which is a <code>tf.Tensor</code>. </p>



<pre class="wp-block-code"><code>def matmul_many(tensor):
  accum = tensor
  for _ in tf.range(100):  # will be converted by autograph
    accum = tf.matmul(accum, tensor)
  return accum</code></pre>



<p>It&#8217;s important to note that if <code>tf.range(100)</code> were replaced with <code>range(100)</code>, then the loop would be unrolled, meaning that a graph with 100 <code>matmul</code> operations would be generated.</p>
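


<p>Concretely, the unrolled variant would look like this; tracing it produces a graph with 100 separate <code>matmul</code> nodes instead of a single loop:</p>



<pre class="wp-block-code"><code>def matmul_many_unrolled(tensor):
  accum = tensor
  for _ in range(100):  # Python range: unrolled at trace time
    accum = tf.matmul(accum, tensor)
  return accum</code></pre>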



<p>You can inspect the code that autograph generates on your behalf.</p>



<pre class="wp-block-code"><code>print(tf.autograph.to_code(matmul_many))</code></pre>



<pre class="wp-block-code wp-block-output"><code>from __future__ import print_function

def tf__matmul_many(tensor):
  try:
    with ag__.function_scope('matmul_many'):
      do_return = False
      retval_ = None
      accum = tensor

      def loop_body(loop_vars, accum_1):
        with ag__.function_scope('loop_body'):
          _ = loop_vars
          accum_1 = ag__.converted_call('matmul', tf, ag__.ConversionOptions(recursive=True, verbose=0, strip_decorators=(ag__.convert, ag__.do_not_convert, ag__.converted_call), force_conversion=False, optional_features=ag__.Feature.ALL, internal_convert_user_code=True), (accum_1, tensor), {})
          return accum_1,
      accum, = ag__.for_stmt(ag__.converted_call('range', tf, ag__.ConversionOptions(recursive=True, verbose=0, strip_decorators=(ag__.convert, ag__.do_not_convert, ag__.converted_call), force_conversion=False, optional_features=ag__.Feature.ALL, internal_convert_user_code=True), (100,), {}), None, loop_body, (accum,))
      do_return = True
      retval_ = accum
      return retval_
  except:
    ag__.rewrite_graph_construction_error(ag_source_map__)



tf__matmul_many.autograph_info__ = {}</code></pre>



<h4>Performance</h4>



<p>Graph functions can provide significant speed-ups for programs that execute many small TF operations. For these programs, the Python overhead incurred executing an operation imperatively outstrips the time spent running the operations. As an example, let&#8217;s benchmark the <code>matmul_many</code> function imperatively and as a graph function.</p>



<pre class="wp-block-code"><code>graph_fn = tf.function(matmul_many)</code></pre>



<p>Here&#8217;s the imperative (Python) performance.</p>



<pre class="wp-block-code"><code>%%timeit

matmul_many(tf.ones([2, 2]))</code></pre>



<pre class="wp-block-code wp-block-output"><code>100 loops, best of 3: 13.5 ms per loop</code></pre>



<p>The first call to <code>graph_fn</code> is slow, since this is when the graph function is generated.</p>



<pre class="wp-block-code"><code>%%time

graph_fn(tf.ones([2, 2]))</code></pre>



<pre class="wp-block-code wp-block-output"><code>CPU times: user 158 ms, sys: 2.02 ms, total: 160 ms
Wall time: 159 ms
&lt;tf.Tensor: id=1530126, shape=(2, 2), dtype=float32, numpy=
array([[1., 1.],
       [1., 1.]], dtype=float32)></code></pre>



<p>But subsequent calls are an order of magnitude faster than imperatively executing <code>matmul_many</code>.</p>



<pre class="wp-block-code"><code>%%timeit

graph_fn(tf.ones([2, 2]))</code></pre>



<pre class="wp-block-code wp-block-output"><code>1000 loops, best of 3: 1.97 ms per loop</code></pre>



<h2 id="comparison-to-other-python-libraries">VII. Comparison to other Python  libraries</h2>



<p>There are many libraries for machine learning. Out of all of them, PyTorch 1.0 is the one that&#8217;s most similar to TF 2.0. Both TF 2.0 and PyTorch 1.0 execute imperatively by default, and both provide ways to transform Python functions into graph-backed functions (compare <code>tf.function</code> and <code><a href="https://pytorch.org/docs/stable/jit.html#">torch.jit</a></code>). The PyTorch JIT tracer, <code>torch.jit.trace</code>, doesn&#8217;t implement the multiple-dispatch semantics that <code>tf.function</code> does, and it also doesn&#8217;t rewrite the AST. On the other hand, <code>TorchScript</code> lets you use Python control flow, but unlike <code>tf.function</code>, it doesn&#8217;t let you mix in arbitrary Python code that parametrizes the construction of your graph. That means that in comparison to <code>tf.function</code>, <code>TorchScript</code> makes it harder for you to shoot yourself in the foot, while potentially limiting your creative expression.</p>



<p>So should you use TF 2.0, or PyTorch 1.0? It depends. Because TF 2.0 is in alpha, it still has some kinks, and its imperative performance still needs work. But you can probably count on TF 2.0 becoming stable sometime this year. If you&#8217;re in industry,  TensorFlow has <a href="https://www.tensorflow.org/tfx/">TFX</a> for production pipelines, <a href="https://www.tensorflow.org/lite">TFLite</a> for deploying to mobile, and <a href="https://www.tensorflow.org/js">TensorFlow.js</a> for the web. PyTorch recently made a <a href="https://pytorch.org/blog/the-road-to-1_0/">commitment to production</a>; since then, they&#8217;ve added <a href="https://pytorch.org/tutorials/advanced/cpp_export.html">C++ inference</a> and deployment solutions for several cloud providers. For research, I&#8217;ve found that TF 2.0 and PyTorch 1.0 are sufficiently similar that I&#8217;m comfortable using either one, and my choice of framework depends on my collaborators.</p>



<p>The multi-stage approach of TF 2.0 is similar to what&#8217;s done in <a href="https://github.com/google/jax/">JAX</a>. JAX is great if  you want a functional programming model that looks exactly like NumPy, but with automatic differentiation and GPU support; this is, in fact, what many researchers want. If you don&#8217;t like functional programming, JAX won&#8217;t be a good fit.</p>



<h2 id="domain-specific-languages-for-machine-learning">VIII. Domain-specific languages for machine learning </h2>



<p>TF 2.0 and PyTorch 1.0 are very unusual libraries. <a href="https://julialang.org/blog/2017/12/ml&amp;pl">It has been observed</a> that these libraries resemble <em>domain-specific languages</em> (DSLs) for automatic differentiation and machine learning, embedded in Python (see also <a href="https://arxiv.org/pdf/1903.01855.pdf">our paper on TF Eager</a>, TF 2.0&#8217;s precursor). What TF 2.0 and PyTorch 1.0 accomplish in Python is impressive, but they&#8217;re pushing the language to its limits.</p>



<p>There is now significant work underway to embed ML DSLs in languages that are more amenable to compilation than Python, like Swift (<a href="https://arxiv.org/pdf/1711.03016.pdf">DLVM</a>, <a href="https://github.com/tensorflow/swift">Swift for TensorFlow,</a> <a href="https://drive.google.com/file/d/1hUeAJXcAXwz82RXA5VtO5ZoH8cVQhrOK/view">MLIR</a>), and Julia (<a href="https://github.com/FluxML/Flux.jl">Flux</a>, <a href="https://github.com/FluxML/Zygote.jl">Zygote</a>). So while TF 2.0 and PyTorch 1.0 are great libraries, do stay tuned: over the next year (or two, or three?), the ecosystem of programming languages for machine learning will continue to evolve rapidly.</p>
]]></content:encoded>
					
					<wfw:commentRss>https://www.debugmind.com/2019/04/07/a-primer-on-tensorflow-2-0/feed/</wfw:commentRss>
			<slash:comments>4</slash:comments>
		
		
			</item>
		<item>
		<title>Maine and Potatoes: Approaching Life Like Steinbeck</title>
		<link>https://www.debugmind.com/2016/07/31/life-like-steinbeck/</link>
					<comments>https://www.debugmind.com/2016/07/31/life-like-steinbeck/#comments</comments>
		
		<dc:creator><![CDATA[Akshay Agrawal]]></dc:creator>
		<pubDate>Mon, 01 Aug 2016 03:28:27 +0000</pubDate>
				<category><![CDATA[Miscellaneous]]></category>
		<category><![CDATA[design]]></category>
		<category><![CDATA[Gita]]></category>
		<category><![CDATA[Kierkegaard]]></category>
		<category><![CDATA[purpose]]></category>
		<category><![CDATA[Steinbeck]]></category>
		<category><![CDATA[Travels with Charley]]></category>
		<guid isPermaLink="false">http://www.debugmind.com/?p=2049</guid>

					<description><![CDATA[Per my sister&#8217;s recommendation, I recently picked up Travels with Charley, Steinbeck&#8217;s account1 of a cross-country road trip he took one summer with his beloved Poodle in tow. Steinbeck&#8217;s favorite kind of journey is a meandering one. By his own admission, he&#8217;s &#8220;going somewhere&#8221; but &#8220;doesn&#8217;t greatly care whether&#8221; he arrives2. Reflecting upon a leisurely [&#8230;]]]></description>
										<content:encoded><![CDATA[<p>Per <a href="http://citrusandgold.com/about-me/" target="_blank">my sister&#8217;s</a> recommendation, I recently picked up <em><a href="https://en.wikipedia.org/wiki/Travels_with_Charley" target="_blank">Travels with Charley</a></em>, Steinbeck&#8217;s account<sup><a href="#[1]">1</a></sup> of a cross-country road trip he took one summer with his beloved Poodle in tow.</p>
<p>Steinbeck&#8217;s favorite kind of journey is a meandering one. By his own admission, he&#8217;s &#8220;going somewhere&#8221; but &#8220;doesn&#8217;t greatly care whether&#8221; he arrives<sup><a href="#[2]">2</a></sup>. Reflecting upon a leisurely detour through Maine&#8217;s potato farms, he writes,</p>
<blockquote><p>everything in the world must have design or the human mind rejects it. But in addition it must have purpose or the human conscience shies away from it. Maine was my design, potatoes my purpose.</p></blockquote>
<p>It&#8217;s tempting to interrogate whether your pursuits are meaningful, be they hobbies or careers<sup><a href="#[3]">3</a></sup>. A degree of such interrogation can be constructive: living with intention necessitates a design and a purpose. But indulge too much and you risk descending into a Hamlet-esque, nihilistic spiral that will inevitably derail your pursuit. The last thing you (and certainly I) want is to end up as <a href="http://dbanach.com/absurd%20reasoning.htm" target="_blank">Camus&#8217; strawman</a>, the individual who cannot cope with his discovery that life is without meaning. That Steinbeck&#8217;s design was Maine and his purpose potatoes is a gentle reminder that our own designs and purposes need not be grand. All that we require of them is to exist. </p>
<p>Footnotes<br />
<small><span id="[1]">[1]</span> The introduction to the book&#8217;s 50th anniversary edition cautions readers against taking Steinbeck&#8217;s story too literally, for he was &#8220;a novelist at heart.&#8221; But the book reads truthfully enough and, just as important, entertainingly enough. As author and writing instructor <a href="https://en.wikipedia.org/wiki/John_McPhee" target="_blank">John McPhee</a> joked <a href="http://www.wnyc.org/story/episode-38-wisdom-john-mcphee-and-agony-ipod-lockout/" target="_blank">in an interview </a>with <em>The New Yorker&#8217;s</em> <a href="https://en.wikipedia.org/wiki/David_Remnick" target="_blank">David Remnick</a>, 94 percent accuracy is good enough for creative non-fiction.<br />
<span id="[2]">[2]</span> Approaching our actions with such a sentiment is precisely the <a href="https://en.wikipedia.org/wiki/Bhagavad_Gita" target="_blank"><em>Bhagavad Gita&#8217;s</em></a> prescription for attaining the Good Life. For that matter, it is also the prescription of <a href="https://en.wikipedia.org/wiki/S%C3%B8ren_Kierkegaard" target="_blank">Kierkegaard&#8217;s</a> <a href="https://en.wikipedia.org/wiki/Fear_and_Trembling" target="_blank"><em>Fear and Trembling</em></a>. Both recommend we resign ourselves to the frustration of our desires, but that we do so happily so that we may pursue them nonetheless. If this sounds difficult to you, you&#8217;re not alone; Kierkegaard&#8217;s narrator describes this process as something he cannot hope to understand, though he spends the entire text describing it.<br />
<span id="[3]">[3]</span> Academics at <a href="http://sloanreview.mit.edu/article/what-makes-work-meaningful-or-meaningless/?utm_medium=pr&#038;utm_source=release&#038;utm_campaign=featjune16" target="_blank">MIT&#8217;s Sloan School of Management recently asked 135 people</a> what made their work meaningful. For many, meaningful work is simultaneously &#8220;intensely personal&#8221; and bigger than themselves.</small></p>
]]></content:encoded>
					
					<wfw:commentRss>https://www.debugmind.com/2016/07/31/life-like-steinbeck/feed/</wfw:commentRss>
			<slash:comments>1</slash:comments>
		
		
			</item>
		<item>
		<title>Learning about Learning: Educational Data Mining</title>
		<link>https://www.debugmind.com/2015/09/16/learning-learning-educational-data-mining/</link>
					<comments>https://www.debugmind.com/2015/09/16/learning-learning-educational-data-mining/#comments</comments>
		
		<dc:creator><![CDATA[Akshay Agrawal]]></dc:creator>
		<pubDate>Thu, 17 Sep 2015 04:14:49 +0000</pubDate>
				<category><![CDATA[Miscellaneous]]></category>
		<category><![CDATA[EDM]]></category>
		<category><![CDATA[machine learning]]></category>
		<category><![CDATA[Madrid]]></category>
		<category><![CDATA[Research]]></category>
		<category><![CDATA[YouEDU]]></category>
		<guid isPermaLink="false">http://www.debugmind.com/?p=1855</guid>

					<description><![CDATA[Earlier this summer, I crossed the Atlantic and traveled to Madrid to give a talk at the 8th International Conference on Educational Data Mining. I presented a prototype, built by myself and my colleagues at Stanford, that stages intelligent interventions in the discussion forums of Massive Open Online Courses. Our pipeline, dubbed YouEDU, detects confusion [&#8230;]]]></description>
										<content:encoded><![CDATA[<p>Earlier this summer, I crossed the Atlantic and traveled to Madrid to give a talk at the 8th International Conference on Educational Data Mining. I presented a prototype, built by myself and my colleagues at Stanford, that stages intelligent interventions in the discussion forums of Massive Open Online Courses. Our pipeline, dubbed YouEDU, detects confusion in forum posts and recommends instructional video snippets to their presumably confused authors. </p>
<p><div id="attachment_1869" style="width: 790px" class="wp-caption aligncenter"><a href="http://www.debugmind.com/wp-content/uploads/2015/09/IMG_3884-2.jpg"><img aria-describedby="caption-attachment-1869" loading="lazy" src="http://www.debugmind.com/wp-content/uploads/2015/09/IMG_3884-2-1024x232.jpg" alt="EDM took place in Madrid this year. Pictured above the Retiro Pond in Buen Retiro Park. It has nothing to do with EDM, but I enjoyed the park so please enjoy the picture." width="780" height="177" class="size-large wp-image-1869" srcset="https://www.debugmind.com/wp-content/uploads/2015/09/IMG_3884-2-1024x232.jpg 1024w, https://www.debugmind.com/wp-content/uploads/2015/09/IMG_3884-2-300x68.jpg 300w" sizes="(max-width: 780px) 100vw, 780px" /></a><p id="caption-attachment-1869" class="wp-caption-text">The Educational Data Mining Conference took place in Madrid this year. Pictured above the Retiro Pond in Buen Retiro Park. It has nothing to do with EDM. But I enjoyed the park so please enjoy the picture.</p></div></p>
<p><strong>No, not that kind of EDM</strong><br />
Educational Data Mining — affectionately collapsed to EDM — might sound opaque. From the society’s website, EDM is the science and practice of</p>
<blockquote><p>developing methods for exploring the unique and increasingly large-scale data that come from educational settings, and using those methods to better understand students, and the settings which they learn in.
</p></blockquote>
<p>Any educational setting that generates data is a candidate for EDM research. So really any educational setting is a candidate, full stop. In practice, EDM-ers often find themselves focusing their efforts on computer-mediated settings, like tutoring systems, educational games, and MOOCs, perhaps because it’s easy to instrument these systems to leave behind trails of data. </p>
<p>Popular methods applied to these educational settings include student modeling, affect detection, and interventions. Student models attempt to approximate the knowledge that a student possesses about a particular subject, just as a teacher might assess her student, while affect detectors classify the behavior and emotional states of students. Interventions attempt to improve the experience of students at critical times. My own work marries affect detectors with interventions in an attempt to improve MOOC discussion forums.</p>
<p><strong>Making discussion forums smarter<br />
</strong>I became interested in augmenting online education with artificial intelligence a couple of years ago, after listening to a talk at Google and speaking with Peter Norvig. That interest lay dormant for a year, until I began working as a teaching assistant for a Stanford MOOC. I spent a lot of time answering questions in the discussion forum, questions asked by thousands of students. Helping these students was fulfilling work, to be sure. But slogging through a single, unorganized stream of questions and manually identifying urgent ones wasn’t particularly fun. I would have loved an automatically organized inbox of questions. </p>
<p><div id="attachment_1876" style="width: 310px" class="wp-caption alignleft"><a href="http://www.debugmind.com/wp-content/uploads/2015/09/Screen-Shot-2015-09-16-at-8.58.43-PM.png"><img aria-describedby="caption-attachment-1876" loading="lazy" src="http://www.debugmind.com/wp-content/uploads/2015/09/Screen-Shot-2015-09-16-at-8.58.43-PM-300x219.png" alt="The YouEDU architecture. Posts are fed to a classifier that screens posts for confusion, and our recommender then fetches clips relevant to the confused posts." width="300" height="219" class="size-medium wp-image-1876" srcset="https://www.debugmind.com/wp-content/uploads/2015/09/Screen-Shot-2015-09-16-at-8.58.43-PM-300x219.png 300w, https://www.debugmind.com/wp-content/uploads/2015/09/Screen-Shot-2015-09-16-at-8.58.43-PM-1024x747.png 1024w, https://www.debugmind.com/wp-content/uploads/2015/09/Screen-Shot-2015-09-16-at-8.58.43-PM.png 1692w" sizes="(max-width: 300px) 100vw, 300px" /></a><p id="caption-attachment-1876" class="wp-caption-text">The YouEDU architecture. Posts are fed to a classifier that screens posts for confusion, and our recommender then fetches clips relevant to the confused posts.</p></div></p>
<p>That these discussion forums were still “dumb”, so to speak, surprised me. I reached out to the platforms team of Stanford Online Learning, who in turn sent me to Andreas Paepcke, a senior research scientist (and, I should add, an incredibly supportive and kind mentor). It turned out that I wasn’t the only one who wished for a more intelligent discussion forum. I paired up with a student of Andreas’ to tackle the problem of automatically classifying posts by the affect or sentiment they expressed.</p>
<p>Our initial efforts at affect detection were circumscribed by the data available to us. Machine learning tasks like ours need human-tagged data — in our case, we needed a dataset of forum posts in which each post was tagged with information about the affect expressed in it. At the time, no such dataset existed. So we created one: the <a href="http://datastage.stanford.edu/StanfordMoocPosts/" target="_blank" rel="noopener noreferrer">Stanford MOOCPosts dataset</a>, available to researchers upon request.</p>
<p>The dataset powered the rest of our work. It enabled us to build a model to predict whether or not a post expressed confusion, as well as a pipeline to recommend relevant clips from instructional videos to the author of that confused post. </p>
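<p>As a rough sketch of what the confusion-detection piece of such a pipeline can look like (an illustrative toy of my own, not our actual YouEDU code), one can train a simple text classifier on human-tagged posts:</p>
<pre><code>from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

# Toy stand-ins for human-tagged forum posts (1 = confused, 0 = not).
posts = ["I don't understand the lecture on regression",
         "Thanks, that clarified everything!",
         "Why does my code crash on problem 2?",
         "Great course so far."]
confused = [1, 0, 1, 0]

clf = make_pipeline(TfidfVectorizer(), LogisticRegression())
clf.fit(posts, confused)
print(clf.predict(["I'm totally lost on problem 3"]))</code></pre>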
<p>YouEDU was not meant to replace teaching assistants in MOOCs. Videos are notoriously difficult to search through (they’re not indexed, like books are), and YouEDU simply helps confused students find content relevant to the topic they’re confused about. Our affect classifiers can also be used outside of YouEDU — for example, they could be used to highlight urgent posts for the instructors, or even for other students in the forum.</p>
<p><em>If you’d like to learn more about our work, you’re welcome to look at the <a href="http://debugmind.com/youedu.pdf" target="_blank" rel="noopener noreferrer">publication</a>, my <a href="http://www.debugmind.com/wp-content/uploads/2015/09/edm2015-slides.pptx" target="_blank" rel="noopener noreferrer">slide deck</a>, or the <a href="https://github.com/akshayka/edxclassify" target="_blank" rel="noopener noreferrer">edxclassify</a> repository.</em></p>
<p><strong>Data mining is not nefarious<br />
</strong>My experience at EDM was a great one. I learned lots from learned people, made lasting friends and memories, and so on. I could talk at length about interesting talks and papers — like Streeter’s <a href="http://www.educationaldatamining.org/EDM2015/uploads/papers/paper_133.pdf" target="_blank" rel="noopener noreferrer">mixture modeling of learning curves</a>, or MacLellan’s slip-aware <a href="http://www.educationaldatamining.org/EDM2015/uploads/papers/paper_163.pdf" target="_blank" rel="noopener noreferrer">bounded logistic regression</a>. But I won’t. You can skim the <a href="http://www.educationaldatamining.org/EDM2015/proceedings/edm2015_proceedings.pdf" target="_blank" rel="noopener noreferrer">proceedings</a> on your own time.</p>
<p>The EDM community is tightly knit, or at least more tightly knit than that of ACM’s Learning @ Scale, the only other education conference I’ve attended. And though no raves were attended, EDM-ers did close the conference by dancing the night away in a bar, after dining, drinking, and singing upon the roof of the <em>Reina Victoria</em>.</p>
<p>Festivities aside, a shared sense of urgency pulsed through the conference. As of late, the public has grown increasingly suspicious of those who collect and analyze data en masse. We see it in popular culture: <em>Ex Machina</em>, for example, with its damning rendition of a Google-like Big Brother who recklessly and dangerously abuses data, captures the sentiment well. The public’s suspicion is certainly justified, but its non-discriminating nature becomes problematic for EDM-ers. The public fears that those analyzing student data are, like <em>Ex Machina&#8217;</em>s tragic genius, either greedy, hoping to manipulate education in order to monetize it, or careless, liable to botch students’ education altogether. For the record, neither is true. EDM researchers are both well-intentioned and competent.</p>
<p>What’s an EDM-er to do? Some at the conference casually floated the idea of rebranding — perhaps they should call themselves educational data <em>scientists</em>, not miners. Perhaps, too, they should write to legislators to convince them that their particular data mining tasks are not nefarious. Indeed, representative government has, for once, responded promptly to the public mood: Senator Vitter of Louisiana recently introduced a bill that threatens to cripple EDM efforts. The <a href="https://www.congress.gov/bill/114th-congress/senate-bill/1341" target="_blank" rel="noopener noreferrer"><em>Student Privacy Protection Act</em></a>, a proposed amendment to FERPA, would make it illegal for researchers to, among other things, assess or model psychological states, behaviors, or beliefs.</p>
<p>Were Vitter’s bill to become law, it could wipe out the entire field of affect modeling. What’s more, it would ultimately harm students enrolled in online courses — as I hope YouEDU shows, intelligent systems can significantly improve students’ online learning experiences.</p>
<p>Now, that said, I understand why folks might fear a computational system that could predict behavior. I could imagine a scenario in which an educator mapped predicted affect to different curricula: students who appeared confused would be placed in a slow curriculum, while those who appeared knowledgeable would be placed in a faster one. Such tracking would likely fulfill the prophecies of the predictor, creating an artificial and unfortunate gap between the “confused” and “knowledgeable” students. In this scenario, however, the predictive model isn’t inherently harmful to the student’s education. The problem instead lies with the misguided educator. Indeed, consider the paper-and-pencil equivalent of this situation. Our educational system puts too much stock in tests, a type of predictive tool. Perform poorly on a single math test in the fifth grade and you might be placed onto a slow track, making it even less likely that you’ll end up mathematically inclined. Does that mean we should ban tests outright? Probably not. It means we should think more carefully about the policies we design around tests. And so it is for the virtual: it is the human abuse of predictive modeling, rather than predictive modeling in and of itself, that we should guard against.</p>
]]></content:encoded>
					
					<wfw:commentRss>https://www.debugmind.com/2015/09/16/learning-learning-educational-data-mining/feed/</wfw:commentRss>
			<slash:comments>4</slash:comments>
		
		
			</item>
		<item>
		<title>Machines that Learn: Making Distributed Storage Smarter</title>
		<link>https://www.debugmind.com/2014/09/17/machines-learn-making-distributed-storage-smarter/</link>
					<comments>https://www.debugmind.com/2014/09/17/machines-learn-making-distributed-storage-smarter/#respond</comments>
		
		<dc:creator><![CDATA[Akshay Agrawal]]></dc:creator>
		<pubDate>Thu, 18 Sep 2014 01:29:48 +0000</pubDate>
				<category><![CDATA[Miscellaneous]]></category>
		<category><![CDATA[machine learning]]></category>
		<guid isPermaLink="false">http://www.debugmind.com/?p=1833</guid>

					<description><![CDATA[Equipped with shiny machine learning tools, computer scientists these days are optimizing lots of previously manual tasks. The idea is that AI can make certain procedures smarter — we can capitalize on a system’s predictability and implicit structure to automate at least part of the task at hand. For all the progress we’ve made recently [&#8230;]]]></description>
										<content:encoded><![CDATA[<p>Equipped with shiny machine learning tools, computer scientists these days are optimizing lots of previously manual tasks. The idea is that AI can make certain procedures smarter — we can capitalize on a system’s predictability and implicit structure to automate at least part of the task at hand. </p>
<p>For all the progress we’ve made recently in <a href="https://www.wired.com/2014/01/how-to-hack-okcupid/all/" target="_blank" rel="noopener noreferrer">soulmate-searching pipelines</a> and <a href="https://code.edx.org/discern/" target="_blank" rel="noopener noreferrer">essay-grading tools</a>, I haven’t seen too many applications of AI to computer infrastructure.  AI could solve interesting infrastructure problems, particularly when it comes to distributed systems — in a reflexive sort of way, machines can and should use machine learning to learn more about themselves.</p>
<p><strong>Being smart about it: The case for intelligent storage systems</strong><br />
Distributed systems cover a lot of ground; to stop myself from rambling <em>too</em> much, I’ll focus on distributed storage systems here. In these systems, lots of machines work together to provide a transparent storage solution to some number of clients. Different machines often see different workloads — for example, some machines might store particularly hot (i.e., frequently accessed) data, while others might be home to colder data. The variability in workloads matters because particular workloads play better with particular types of storage media. </p>
<p>Manually optimizing for these workloads isn’t feasible. There are just too many files and independent workloads for humans to make good, case-by-case decisions about where files should be stored. </p>
<p>The ideal, then, is a <em>smart</em> storage system. A smart system would automatically adapt to whatever workload we threw at it. By analyzing file system metadata, it would make predictions about files’ eventual usage characteristics and decide where to store them accordingly. If a file looked like it would be hot or short-lived, the smart system could cache it in RAM or flash; otherwise, it could put it on disk. Placement policies informed by such predictions would not only minimize IT administrators’ work, but would also boost performance, lowering latency and increasing throughput on average.</p>
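<p>To make that concrete, here is a toy placement policy. The predictor is a hypothetical stand-in — in a real system it would be a model trained on file-system metadata — but it shows the shape of the decision:</p>
<pre><code># A toy sketch of a predictive placement policy. The predict_hot rule is
# a made-up stand-in for a trained model, purely for illustration.
from dataclasses import dataclass

@dataclass
class FileMeta:
    name: str
    size_bytes: int
    writer: str  # process or user that created the file

def predict_hot(meta: FileMeta) -> bool:
    """Hypothetical model: small files not written by a batch job run hot."""
    return meta.size_bytes < 1 << 20 and meta.writer != "backup-daemon"

def place(meta: FileMeta) -> str:
    """Route predicted-hot files to fast media, everything else to disk."""
    return "flash" if predict_hot(meta) else "disk"

print(place(FileMeta("scratch/tmp.log", 4096, "alice")))             # flash
print(place(FileMeta("archive/db.bak", 10 << 30, "backup-daemon")))  # disk
</code></pre>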
<p><strong>From the past, a view into the future: Self-* storage systems</strong><br />
To my surprise, there doesn’t seem to be a whole lot of work in making storage systems smarter. The largest effort I came across was the <em>self-* storage</em> initiative, undertaken by a few faculty over at CMU back in 2003. From their <a href="https://www.debugmind.com/wp-content/uploads/2014/09/Self-Star-Storage-Systems.pdf" target="_blank" rel="noopener noreferrer">white paper</a>,</p>
<blockquote><p>‘self-* storage systems’ [are] self-configuring, self-organizing, self-tuning, self-healing, self-managing systems of storage bricks …, each consisting of CPU(s), RAM, and a number of disks. Designing self-*-ness in from the start allows construction of high-performance, high-reliability storage infrastructures from weaker, less reliable base units … </p></blockquote>
<p>There’s a wealth of interesting content to be found in the self-* papers. In particular, in <a href="https://www.debugmind.com/wp-content/uploads/2014/09/Attribute-Based-File-Prediction.pdf" target="_blank" rel="noopener noreferrer">Attribute-Based File Prediction</a>, the authors propose ways to exploit metadata and information latent in filenames to bucket files into binary classes related to their sizes, access permissions, and lifespans.</p>
<p>Predictions were made using decision trees, which were constructed using the <a href="https://en.wikipedia.org/wiki/ID3_algorithm" target="_blank" rel="noopener noreferrer">ID3</a> algorithm. Starting from a root node that holds the entire training set, ID3 splits on the feature that looks like the best predictor, creating one sub-tree per value of that feature (the metric typically used is <a href="https://en.wikipedia.org/wiki/Kullback%E2%80%93Leibler_divergence" target="_blank" rel="noopener noreferrer">information gain</a>, but the self-* project used the chi-squared statistic). The algorithm recurses until the leaf nodes correspond to classes. As an aside, it turns out that ID3 tends to overfit training data — <a href="http://isites.harvard.edu/fs/docs/icb.topic539621.files/lec7.pdf">these lecture notes</a> discuss ways to prune decision trees in an attempt to increase their predictive power.</p>
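<p>For the curious, here is a compact, illustrative ID3 with chi-squared splits. It is a sketch of the general algorithm, not the self-* project’s code, and it handles any number of classes and feature values:</p>
<pre><code># A compact ID3-style tree for categorical features, using the chi-squared
# statistic to pick the split (as the self-* project did) rather than
# information gain. Illustrative only; not the project's implementation.
from collections import Counter

def chi2(rows, labels, feature):
    """Chi-squared statistic of the (feature value x class) contingency table."""
    n = len(rows)
    vals = Counter(row[feature] for row in rows)
    classes = Counter(labels)
    stat = 0.0
    for v in vals:
        for c in classes:
            observed = sum(1 for row, y in zip(rows, labels)
                           if row[feature] == v and y == c)
            expected = vals[v] * classes[c] / n
            stat += (observed - expected) ** 2 / expected
    return stat

def id3(rows, labels, features):
    # Leaf: all labels agree, or no features left to split on.
    if len(set(labels)) == 1 or not features:
        return Counter(labels).most_common(1)[0][0]
    best = max(features, key=lambda f: chi2(rows, labels, f))
    tree = {"feature": best, "children": {}}
    for v in set(row[best] for row in rows):
        subset = [(r, y) for r, y in zip(rows, labels) if r[best] == v]
        sub_rows, sub_labels = zip(*subset)
        tree["children"][v] = id3(list(sub_rows), list(sub_labels),
                                  [f for f in features if f != best])
    return tree

# Tiny, made-up example in the spirit of filename-based prediction.
rows = [{"ext": "log", "prefix": "tmp"}, {"ext": "mp3", "prefix": "song"},
        {"ext": "log", "prefix": "tmp"}, {"ext": "jpg", "prefix": "img"}]
labels = ["short-lived", "long-lived", "short-lived", "long-lived"]
print(id3(rows, labels, ["ext", "prefix"]))
</code></pre>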
<p><div id="attachment_1838" style="width: 772px" class="wp-caption aligncenter"><a href="https://www.debugmind.com/wp-content/uploads/2014/09/Screen-Shot-2014-09-17-at-6.12.53-PM.png"><img aria-describedby="caption-attachment-1838" loading="lazy" src="https://www.debugmind.com/wp-content/uploads/2014/09/Screen-Shot-2014-09-17-at-6.12.53-PM.png" alt="Diagram from &quot;File Classification in Self-* Storage Systems&quot;, by Ganger, et. al." width="762" height="570" class="size-full wp-image-1838" srcset="https://www.debugmind.com/wp-content/uploads/2014/09/Screen-Shot-2014-09-17-at-6.12.53-PM.png 762w, https://www.debugmind.com/wp-content/uploads/2014/09/Screen-Shot-2014-09-17-at-6.12.53-PM-300x224.png 300w" sizes="(max-width: 762px) 100vw, 762px" /></a><p id="caption-attachment-1838" class="wp-caption-text">Diagram from &#8220;File Classification in Self-* Storage Systems&#8221;, by Ganger, et. al.</p></div></p>
<p>The features used were coarse. For example, files’ basenames were broken into three chunks: prefixes (characters preceding the first period), extensions (characters following the last period), and middles (everything in between); directories were disregarded. These simple heuristics proved fairly effective; prediction accuracy didn’t fall below 70 percent.</p>
<p>It’s not clear how a decision tree trained on these same features would perform if more granular predictions were desired, or if the observed filenames were less structured (what if they lacked delimiters?). I could imagine a much richer feature set for filenames; possible features might include the number of directories, the ratio of digits to letters, TTLs, etc.</p>
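<p>Here’s what the three-chunk split looks like in code, along with a couple of the richer features I’m imagining. The extra features (directory depth, digit ratio) are my own speculation, not features from the self-* papers:</p>
<pre><code># The prefix/middle/extension split described above, plus two speculative
# richer features of my own (not from the self-* papers).
import os

def filename_features(path: str) -> dict:
    basename = os.path.basename(path)
    first_dot = basename.find(".")
    last_dot = basename.rfind(".")
    prefix = basename[:first_dot] if first_dot != -1 else basename
    extension = basename[last_dot + 1:] if last_dot != -1 else ""
    middle = basename[first_dot + 1:last_dot] if first_dot < last_dot else ""
    return {
        "prefix": prefix,
        "extension": extension,
        "middle": middle,
        # Speculative richer features:
        "depth": path.count(os.sep),
        "digit_ratio": sum(ch.isdigit() for ch in basename) / max(len(basename), 1),
    }

print(filename_features("/var/log/app.2014.09.17.log"))
</code></pre>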
<p><strong>From research to reality: Picking up where self-* left off</strong><br />
The self-* project was an ambitious one — the researchers planned to launch a large scale implementation of it called Ursa Major, which would offer 100s of terabytes of automatically tuned storage to CMU researchers.</p>
<p>I recently corresponded with CMU professor Greg Ganger, who led the self-* project. It turns out that Ursa Major never fully materialized, though significant and practical progress in smart storage systems was made nonetheless. That the self-* project lives no longer doesn’t mean the idea of smart storage systems should die with it. The onus lies with us to pick up the torch, and to continue where the folks at CMU left off.</p>
]]></content:encoded>
					
					<wfw:commentRss>https://www.debugmind.com/2014/09/17/machines-learn-making-distributed-storage-smarter/feed/</wfw:commentRss>
			<slash:comments>0</slash:comments>
		
		
			</item>
		<item>
		<title>A Small Glass Box</title>
		<link>https://www.debugmind.com/2014/08/28/small-glass-box/</link>
					<comments>https://www.debugmind.com/2014/08/28/small-glass-box/#respond</comments>
		
		<dc:creator><![CDATA[Akshay Agrawal]]></dc:creator>
		<pubDate>Thu, 28 Aug 2014 07:34:29 +0000</pubDate>
				<category><![CDATA[Miscellaneous]]></category>
		<guid isPermaLink="false">http://www.debugmind.com/?p=1762</guid>

					<description><![CDATA[I took a trip up to San Francisco’s Exploratorium, some two weeks past. Though recently relocated, the Exploratorium is comfortably familiar. It’s still packed with exhibits that span the spectrum from mystically enchanting (one station lets museum-goers create delicate purple auroras that warp and spiral in a glass tube) to delightfully curious (another rapidly spins [&#8230;]]]></description>
										<content:encoded><![CDATA[<p>I took a trip up to San Francisco’s Exploratorium, some two weeks past. Though recently relocated, the Exploratorium is comfortably familiar. It’s still packed with exhibits that span the spectrum from mystically enchanting (one station lets museum-goers create delicate purple auroras that warp and spiral in a glass tube) to delightfully curious (another rapidly spins dozens of Lego Batmen and dolphins, making them dance to the tune of the Caped Crusader’s catchy theme song). </p>
<p><div id="attachment_1767" style="width: 615px" class="wp-caption aligncenter"><a href="http://www.debugmind.com/wp-content/uploads/2014/08/Tinkering.jpg"><img aria-describedby="caption-attachment-1767" loading="lazy" src="http://www.debugmind.com/wp-content/uploads/2014/08/Tinkering.jpg" alt="Exhibits at this unconventional museum are designed to stir your curiosity. It's hard to resist playing with them, but of course there's no need to — almost everything is hands-on. Photo by Sara Yang. " width="605" height="328" class="size-full wp-image-1767" srcset="https://www.debugmind.com/wp-content/uploads/2014/08/Tinkering.jpg 605w, https://www.debugmind.com/wp-content/uploads/2014/08/Tinkering-300x162.jpg 300w" sizes="(max-width: 605px) 100vw, 605px" /></a><p id="caption-attachment-1767" class="wp-caption-text">Exhibits at this unconventional museum are designed to stir your curiosity. It&#8217;s hard to resist playing with them, but of course there&#8217;s no need to — almost everything is hands-on. Photo by Sara Yang.</p></div></p>
<p>I meandered through the museum, all the while searching for a particular treasure. Just before the closing bells rang, I stumbled upon it: the cloud chamber, a large, humming, refrigerated box with a sky-facing window that allows for the observation of cosmic radiation. Cosmic rays hail from beyond the solar system. They collide in the earth’s atmosphere, and minuscule particles rain torrentially upon us in the aftermath. The cloud chamber makes an otherwise imperceptible and invisible downpour from the heavens palpably visible, if only for a fleeting moment. </p>
<p><div id="attachment_1770" style="width: 250px" class="wp-caption alignleft"><a href="http://www.debugmind.com/wp-content/uploads/2014/08/supplies-big.jpg"><img aria-describedby="caption-attachment-1770" loading="lazy" src="http://www.debugmind.com/wp-content/uploads/2014/08/supplies-small.jpg" alt="Our homemade cloud chamber consists of a small box with a lid lined with black felt. In order to nudge muons into uncloaking themselves, we douse the felt with isopropanol and heat it from above with my desk lamp." width="240" height="309" class="size-full wp-image-1770" srcset="https://www.debugmind.com/wp-content/uploads/2014/08/supplies-small.jpg 240w, https://www.debugmind.com/wp-content/uploads/2014/08/supplies-small-233x300.jpg 233w" sizes="(max-width: 240px) 100vw, 240px" /></a><p id="caption-attachment-1770" class="wp-caption-text">Our homemade cloud chamber consists of a small box with a lid lined with black felt. In order to nudge muons into uncloaking themselves, we douse the felt with isopropanol and heat it from above with my desk lamp.</p></div></p>
<p>The sight brought me back four years, to the first time I saw muons zip hither and thither through the same chamber. I had spent the better part of that year in my garage with my friend Hemanth, tinkering on our own chamber for a science project. </p>
<p>On a nostalgic whim, I called up Hemanth the next day. We decided to fire up the chamber once again, for old times’ sake. We scrounged the necessary components, lugged them to Hemanth’s garage, and got started. Pulverizing dry ice, we began working to the sound of snow crunching underfoot and the sight of fumes eddying about. </p>
<p><div id="attachment_1780" style="width: 312px" class="wp-caption alignright"><a href="http://www.debugmind.com/wp-content/uploads/2014/08/dry-ice-large.jpg"><img aria-describedby="caption-attachment-1780" loading="lazy" src="http://www.debugmind.com/wp-content/uploads/2014/08/dry-ice-small1.jpeg" alt="With thick gloves and sturdy hammers, we first crush the dry ice into a coarse powder and pack it tightly into a Styrofoam base, on top of which the chamber sits. The one-two punch of a cooling source and a heating source forces the alcohol into a supersaturated, supercooled state. Muons streaking through the chamber rip electrons off the vapor, causing water molecules to visibly condense around their paths." width="302" height="189" class="size-full wp-image-1780" srcset="https://www.debugmind.com/wp-content/uploads/2014/08/dry-ice-small1.jpeg 302w, https://www.debugmind.com/wp-content/uploads/2014/08/dry-ice-small1-300x187.jpeg 300w" sizes="(max-width: 302px) 100vw, 302px" /></a><p id="caption-attachment-1780" class="wp-caption-text">With thick gloves and sturdy hammers, we first crush the dry ice into a coarse powder and pack it tightly into a Styrofoam base, on top of which the chamber sits. The one-two punch of a cooling source and a heating source forces the alcohol into a supersaturated, supercooled state. Muons streaking through the chamber rip electrons off the vapor, causing water molecules to visibly condense around their paths.</p></div></p>
<p>I followed our procedure as if on autopilot; my mind wandered and let bittersweet memories leak. We packed the dry ice into a foam base (days colored by failed prototype runs), doused the chamber with isopropanol (afternoons brightened by faint flashes of muons), and positioned my lamp atop the box (nights illuminated by the bluish glow of computer monitors).</p>
<p><div id="attachment_1793" style="width: 410px" class="wp-caption aligncenter"><a href="http://www.debugmind.com/wp-content/uploads/2014/08/viewing-chamber-2.jpg"><img aria-describedby="caption-attachment-1793" loading="lazy" src="http://www.debugmind.com/wp-content/uploads/2014/08/viewing-chamber-2.jpg" alt="Our small glass box held us rapt, as we saw the ghosts of muons pass through it. Unfortunately, the streaks are difficult to capture on camera." width="400" height="404" class="size-full wp-image-1793" srcset="https://www.debugmind.com/wp-content/uploads/2014/08/viewing-chamber-2.jpg 400w, https://www.debugmind.com/wp-content/uploads/2014/08/viewing-chamber-2-297x300.jpg 297w" sizes="(max-width: 400px) 100vw, 400px" /></a><p id="caption-attachment-1793" class="wp-caption-text">Our small glass box held us rapt, as we saw the ghosts of muons pass through it. Unfortunately, the streaks are difficult to capture on camera.</p></div></p>
<p>We left the chamber to run for some time. When we returned, muons were streaking visibly through it. Spellbound, we lingered by the chamber for over half an hour. Four years ago, an anxious desire to create something novel and a preoccupation with results left little room for wonder. Now, we could stare into the cloud chamber for but the simple sake of doing so. The muons that passed through it, falling like delicate strands of spider web, were, paradoxically, both otherworldly and earthly. Our small glass box, glued together by a mom-and-pop craft shop, had become a window into the universe’s secrets. The sight was as humbling as it was beautiful. </p>
]]></content:encoded>
					
					<wfw:commentRss>https://www.debugmind.com/2014/08/28/small-glass-box/feed/</wfw:commentRss>
			<slash:comments>0</slash:comments>
		
		
			</item>
	</channel>
</rss>
