<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>The Manifold</title>
	<atom:link href="http://www.stumblingfool.org/blog/?feed=rss2" rel="self" type="application/rss+xml" />
	<link>http://www.stumblingfool.org/blog</link>
	<description>Things I&#039;m working through</description>
	<lastBuildDate>Thu, 05 Jan 2012 18:34:56 +0000</lastBuildDate>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>http://wordpress.org/?v=3.1.3</generator>
		<item>
		<title>Making with atoms, part I</title>
		<link>http://www.stumblingfool.org/blog/?p=227</link>
		<comments>http://www.stumblingfool.org/blog/?p=227#comments</comments>
		<pubDate>Thu, 05 Jan 2012 18:34:56 +0000</pubDate>
		<dc:creator>CQM</dc:creator>
				<category><![CDATA[Uncategorized]]></category>

		<guid isPermaLink="false">http://www.stumblingfool.org/blog/?p=227</guid>
		<description><![CDATA[I have a Lumenlabs micro CNC machine which was purchased by my brother for some prototyping work. It turned out that I was more able to tinker with it than he was, and so I ended up taking it off his hands. I have the thing in working order, and while it didn't take quite [...]]]></description>
			<content:encoded><![CDATA[<p>I have a <a href="http://www.lumenlabs.com">Lumenlabs</a> micro CNC machine which was purchased by my brother for some prototyping work.  It turned out that I was more able to tinker with it than he was, and so I ended up taking it off his hands.  I have the thing in working order, and while it didn't take quite as much tinkering as we originally thought it would, it has taken a bit of work.  Of course, my skills as a machinist are extremely limited, so the whole thing has been (and continues to be) a learning experience.</p>
<p>When he bought the micro, my brother was intending to use it to make prototypes for our <a href="http://www.thornwerks.com">everyday carry pen and stylus</a>.  So far, that hasn't worked out. Most significantly, the design we're going with would be more appropriately fashioned on a lathe (or at least a 4 axis mill; the micro is a 3 axis machine) but also, the micro (as shipped) has a couple of deficiencies that make it hard to use for serious work. More about those in a minute.  The last piece of the picture has to do with my (lack of) skills, and the learning curve I've encountered in working with this device.  In that regard, the micro's deficiencies haven't hurt us that much, as they've been something I can learn from.</p>
<p>So, deficiencies.  As far as my brother was concerned, the biggest problem was in the realm of documentation and customer support.  Getting the wiring and the computer interface set up properly required a little digging.  All of the information is supplied by Lumenlabs, but it's behind a pay wall.  The purchase of the machine grants access for a year (in theory) but that wasn't made clear to us at first, and so we did a lot of stumbling around until we managed to get everything hooked up.  That part didn't bother me as much: I figure a machine made for hobbyists has a certain amount of tweaking built into its design.  My issues rest with some missing features of the machine.</p>
<p>First, positioning.  After installing the limit switches, the machine zeros itself, and can locate positions in x and y with relative ease.  The z location, however, is set at the top, rather than the bottom, and so zeroing in z doesn't give you a clear idea of the height of the end of the tool above the table.  Furthermore, any time you change a tool, this (unknown) height will change in an unknown and probably unpredictable way.  Also, moving too quickly in z will cause the motor to bind a little, and the z position gets lost.  I don't know if the binding is a result of the design, or if it's an alignment issue that I can fix, or if some part of the z axis on our machine is bent or otherwise damaged.  I haven't tried too hard to fix it, because the workaround (don't move too fast in z) is reasonably easy.  I think that height detection/touch off is doable (and in fact, I've read a bit about people who are into pcb milling doing some fancy things with bilinear interpolation to get a good height map of the work piece) so that's on my list of projects.  In any event, I can touch off manually, and have a pretty good idea of where the end of the tool is.  So this is really more of an inconvenience than anything else.</p>
<p>The other, perhaps more serious limitation of my current hardware is that my spindle only has one speed, and it's too fast.  Which is to say, it works fine in wood and plastic, but I run into serious chatter problems when milling aluminum, and I wouldn't imagine trying to do steel.  I have three or so options here.  The simplest would be to try a variac (... if I only had access to the stuff in my parents' basement!).  Next, I could buy either just the speed control or a whole packaged flex shaft grinder; though I don't have any guarantee that a speed control for another grinder would work with mine.  Last, I found a circuit schematic online that I could build.  That solution has a certain appeal, but would be most time consuming.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.stumblingfool.org/blog/?feed=rss2&#038;p=227</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>django organization and templates</title>
		<link>http://www.stumblingfool.org/blog/?p=219</link>
		<comments>http://www.stumblingfool.org/blog/?p=219#comments</comments>
		<pubDate>Mon, 19 Dec 2011 22:13:23 +0000</pubDate>
		<dc:creator>CQM</dc:creator>
				<category><![CDATA[django]]></category>

		<guid isPermaLink="false">http://www.stumblingfool.org/blog/?p=219</guid>
		<description><![CDATA[I'm building a django project to manage the research I'm doing on optimization. (I'll update this with a link when it's functional enough that a link would be useful). The project consists of a number of apps, each with their own models, views, and templates. I want the whole project to have the same look [...]]]></description>
			<content:encoded><![CDATA[<p>I'm building a django project to manage the research I'm doing on optimization.  (I'll update this with a link when it's functional enough that a link would be useful).  The project consists of a number of apps, each with their own models, views, and templates.  I want the whole project to have the same look and feel, though, so I'd like all of the app templates to use a project-wide css file, and to extend a project-wide base template.  Here's what I've tried, and how it works.</p>
<p>For simplicity, let's talk about a project (called optimization) with two apps (called code and experiments).  I'd like the relevant parts of the directory structure to look like this:<br />
<code><br />
optimization/<br />
----code/<br />
--------templates/<br />
------------code_base.html<br />
------------code_module.html<br />
------------code_module_list.html<br />
----experiments/<br />
--------templates/<br />
------------exp_base.html<br />
------------exp_setup.html<br />
------------exp_results.html<br />
----templates/<br />
--------base.html<br />
</code></p>
<p>In this scenario, <code>code_module.html</code> extends </code>code_base.html</code>, which extends <code>base.html</code>.  The project-wide base template <code>base.html</code> contains the site-wide navigation, css, and javascript, as well as the general layout.  Each app base template modifies the navigation to indicate which section of the site the user is viewing, and sets up the basis of interaction with the data relevant to that app by defining a default-for-this-app content block.</p>
<p>So far so good; this kind of modularity is explicitly within the design philosophy of django, and is actually not that hard to accomplish.  The key steps are:</p>
<ol>
<li>Include <code>optimization/templates</code> in the <code>TEMPLATE_DIRS</code> variable, and</li>
<li>Include <code>django.template.loaders.app_directories.Loader</code> in <code>TEMPLATE_LOADERS</code></li>
</ol>
<p>That's essentially it.  There are two more caveats, though (or at least, two that I've run up against so far):</p>
<ol>
<li>Use absolute, rather than relative paths when adding directories to <code>TEMPLATE_DIRS</code>.  This may require a difference between development and production settings, so keep an eye on that.</li>
<li>If you're using generic views, the default template location will be <code>appname/templatename.html</code>. Since I wanted to keep the templates for each app under the template directory of that app, this meant that django was looking for the templates in, for example <code>optimization/codebase/templates/codebase/</code>, which offended my sense of aesthetics.  It's reasonably easy to get around this, though, by passing the actual location of the desired template to your generic view in <code>urls.py</code>.</li>
</ol>
]]></content:encoded>
			<wfw:commentRss>http://www.stumblingfool.org/blog/?feed=rss2&#038;p=219</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Quiver plots in octave</title>
		<link>http://www.stumblingfool.org/blog/?p=193</link>
		<comments>http://www.stumblingfool.org/blog/?p=193#comments</comments>
		<pubDate>Thu, 01 Sep 2011 21:13:07 +0000</pubDate>
		<dc:creator>CQM</dc:creator>
				<category><![CDATA[Uncategorized]]></category>

		<guid isPermaLink="false">http://www.stumblingfool.org/blog/?p=193</guid>
		<description><![CDATA[As part of the Electricity and Magnetism course I'm teaching this semester, I assigned my students the task of making some vector field (quiver) plots. I told them they could use either Maxima or Octave, but that it is much easier to do in Octave. That's the subject of this post. The Process The general [...]]]></description>
			<content:encoded><![CDATA[<p>As part of the Electricity and Magnetism course I'm teaching this semester, I assigned my students the task of making some vector field (quiver) plots. I told them they could use either Maxima or Octave, but that it is much easier to do in Octave.  That's the subject of this post.</p>
<h3>The Process</h3>
<p>The general process for producing a quiver plot has 3 steps:</p>
<ol>
<li>Specify the locations for the vectors</li>
<li>Define the vector function in terms of the locations</li>
<li>Plot</li>
</ol>
<p>This process would be the same regardless of the tool you're using, but since octave was made with matrix operations in mind, there are built-in functions that largely take care of these operations for you.  I'm not aware of either built-ins or add-on packages for maxima that do the same (though I plan to explore that in a future post), and that's why I say this activity is easier in octave than in maxima.</p>
<p>It will be easier to see how this works with a specific example, so let's run through one. Since the course is on E+M, it makes sense to start with the field due to a point charge located at the origin.  If the point charge is negative, like an electron, then the field points radially inward, and is proportional to the inverse square of the distance from the origin.  Said with math,<br />
<p style='text-align:center;'><span class='MathJax_Preview'><img src='http://www.stumblingfool.org/blog/wp-content/plugins/latex/cache/tex_c9480e9bbfa5077e3cccf7ca9ac2c338.gif' style='' class='tex' alt=" {\bf E} \propto -{1\over r^2}{\bf \hat r} " /></span><script type='math/tex' mode='display'> {\bf E} \propto -{1\over r^2}{\bf \hat r} </script></p> We'll take the constant of proportionality to be 1; this just amounts to choosing the magnitude of charge carefully.</p>
<p>It is easiest (at first) to use cartesian coordinates for plotting, so let's write this in terms of x and y:<br />
<p style='text-align:center;'><span class='MathJax_Preview'><img src='http://www.stumblingfool.org/blog/wp-content/plugins/latex/cache/tex_7091dfa3aa9d1b521ac00ce41789b5e7.gif' style='' class='tex' alt="{\bf E} = {-x {\bf \hat x} - y {\bf \hat y}\over (x^2 + y^2)^{3/2}} " /></span><script type='math/tex' mode='display'>{\bf E} = {-x {\bf \hat x} - y {\bf \hat y}\over (x^2 + y^2)^{3/2}} </script></p><br />
Thus we have the x and y components of E<br />
<p style='text-align:center;'><span class='MathJax_Preview'><img src='http://www.stumblingfool.org/blog/wp-content/plugins/latex/cache/tex_5e0d7da9038d056bd0d4112545a4593c.gif' style='' class='tex' alt=" E_x = {-x \over (x^2 + y^2)^{3/2}} " /></span><script type='math/tex' mode='display'> E_x = {-x \over (x^2 + y^2)^{3/2}} </script></p> and <p style='text-align:center;'><span class='MathJax_Preview'><img src='http://www.stumblingfool.org/blog/wp-content/plugins/latex/cache/tex_2a8ccd1203ee70059dbf67cbe3250387.gif' style='' class='tex' alt=" E_y = {-y \over (x^2 + y^2)^{3/2}}. " /></span><script type='math/tex' mode='display'> E_y = {-y \over (x^2 + y^2)^{3/2}}. </script></p></p>
<p>Now, we want to plot this over a reasonable range, say [-2,2] in both x and y. This brings us to step 1: define the points for our plot.  We don't (yet) have any reason to use anything other than a regular cartesian grid, so that's what we'll use.  We don't yet know, however, how far apart to put the points.  To start with, let's choose a fairly sparse set of points; we don't want to overdo it.  We can create arrays of x and y coordinates for our points with the <code>meshgrid</code> command, like so:<br />
<code><br />
octave-3.4.0:44> [xx,yy] = meshgrid([-2:0.4:2]);<br />
</code><br />
This puts a 2d array in each of the variables xx and yy, with the x coordinates of our points in xx, and the y coordinates in yy.  Meshgrid is actually more versatile than this, as we'll see in a minute, but this is enough to get us going.</p>
<p>With the points defined, now we need to define the components of our vector field.  The mathematical expressions for each component are listed above; in octave:<br />
<code><br />
octave-3.4.0:45> Ex = -xx./(xx.^2+yy.^2).^(3/2);<br />
octave-3.4.0:46> Ey = -yy./(xx.^2+yy.^2).^(3/2);<br />
</code><br />
As a reminder, we need to use the elementwise operators for division and exponentiation here, or else we get something we don't expect.</p>
<p>Finally, the plot:<br />
<code><br />
octave-3.4.0:47> quiver(xx,yy,Ex,Ey);<br />
</code><br />
This results in the plot:<br />
<a href="http://www.stumblingfool.org/blog/wp-content/uploads/2011/09/quiver1.png"><img src="http://www.stumblingfool.org/blog/wp-content/uploads/2011/09/quiver1-300x225.png" alt="quiver plot" title="quiver1" width="300" height="225" class="alignleft size-medium wp-image-195" /></a><br />
This plot has a number of problems, and our next step will be to fix them, but it serves to illustrate the process.</p>
<h3>Fixing problems</h3>
<p>While this plot does show the direction and relative magnitude of the field throughout the plotting domain, there are a number of things about it which make it less than optimal.  We can improve incrementally, and look at the intermediate results along the way.</p>
<h4>Vector overlap at the center</h4>
<p>The most grievous sin this plot commits is the overlap of the vectors at the center.  Our eyes are naturally drawn to the heads, rather than the tails, of the arrows, and so when they cross over each other, it's confusing.  You might think this could be solved by using some overall multiplicative factor in our definitions for Ex and Ey, but it turns out that doesn't help.  Unless instructed otherwise, octave autoscales the arrows, and this is usually a good thing.  In this case, though, we want to shrink the arrows just enough so that they don't overlap in the middle.<br />
<a href="http://www.stumblingfool.org/blog/wp-content/uploads/2011/09/quiver2.png"><img src="http://www.stumblingfool.org/blog/wp-content/uploads/2011/09/quiver2-300x225.png" alt="" title="quiver2" width="300" height="225" class="alignleft size-medium wp-image-198" /></a>Fortunately, the <code>quiver</code> command has a scale option.  Exactly how much to scale the vectors is largely a question of taste, and some trial and error may be involved.  For this problem, I scaled the vectors to 3/8 of their original length, and that gave results I find acceptable:</p>
]]></content:encoded>
			<wfw:commentRss>http://www.stumblingfool.org/blog/?feed=rss2&#038;p=193</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Resonance, the PRC, and adaptation</title>
		<link>http://www.stumblingfool.org/blog/?p=172</link>
		<comments>http://www.stumblingfool.org/blog/?p=172#comments</comments>
		<pubDate>Sat, 01 Jan 2011 20:33:24 +0000</pubDate>
		<dc:creator>CQM</dc:creator>
				<category><![CDATA[Sleep]]></category>

		<guid isPermaLink="false">http://www.stumblingfool.org/blog/?p=172</guid>
		<description><![CDATA[Recently on the polyphasic google group, someone posed the question: Are there accounts of people who've attempted to adapt to any of these low-sleep schedules with what's generally agreed upon as the correct steps and still didn't adapt, or all evidence indicates anyone can theoretically adapt? A few months earlier (from this posting), I tried [...]]]></description>
			<content:encoded><![CDATA[<p>Recently on the polyphasic google group, someone posed the question:</p>
<blockquote><p>
Are there accounts of people who've attempted to adapt to any of these low-sleep schedules with what's generally agreed upon as the correct steps and still didn't adapt, or all evidence indicates anyone can theoretically adapt?</p>
<p>A few months earlier (from this posting), I tried forcing myself on ~6 hours monophasic sleep a day, largely based on an alleged second-hand story of a guy forcing himself to do 4 hours sleep, claiming to adapt after a month.  I stopped after 5 weeks when there was no long-term improvement in wakefulness.  I'll hopefully not do such experiments based on flimsy evidence next time <img src='http://www.stumblingfool.org/blog/wp-includes/images/smilies/icon_smile.gif' alt=':-)' class='wp-smiley' />   I figure now that the story's either not true or that there was something special about the guy's body, the latter of which leads me to inquire on the subject in relation to polyphasic sleepers.
</p></blockquote>
<p>While I don't think there is a definitive answer to this question, I think we can gain some insight into what might be going on by examining how the body adapts to changes in the external clock (for example, at daylight savings time, or when traveling across time zones).  The body's internal clock can be thought of as an oscillator with its own natural frequency (typically in the neighborhood of, but not exactly 24 hours).  This frequency varies from person to person.  How is it, then, that we stay in sync with the 24 hour day? Well, the circadian clock gets a kick every time we see bright light (among other things) which brings the frequency into harmony with its external stimulus (in this case, the rising and setting of the sun)[1].</p>
<p>How much the internal clock changes, and in which direction, depends on when the person is exposed to the bright light.  A simple analogy helps us to understand why this is so:  if you are pushing a child on a swing, you can make the child go higher by pushing at just the right time.  If you push at the wrong time, however, it is possible to damp out the swing's motion.  For example, if you push on the child while she is moving toward you, you will reduce the amplitude of the motion, whereas a push when the child is moving away will tend to increase the amplitude.</p>
<p>The analogy is not perfect; the circadian clock is not a simple oscillator like a swing.  The basic principle still holds, though, which is that the response of the clock to an outside stimulus depends on exactly when that stimulus is applied.  The mathematical relationship between the time of the cycle when the external stimulus is applied and its result is called the phase response curve, or PRC.</p>
<p>More broadly, in physics this kind of phenomenon is referred to as resonance.  The idea is that a given system will have one or more natural modes of operation, and if one wants to put energy into or take energy out of the system, it is easiest to do near to one of these natural modes.  </p>
<p>What does this have to do with the question at the top of this post?  Perhaps nothing.  My guess, however, is that each of the three major things which affects one's ability to adapt to a new sleep schedule (circadian phase, sleep effectiveness, and sleep/wake thresholds) has some sort of variable response curve to those things which affect it.  If I want to sleep less, I need to increase my sleep effectiveness, and the details of how I go about trying to do that are probably important.  In particular, a gradual transition may be easier than an abrupt one.  Also (and this may be an even stronger effect) when you put your 6 (or 4, or whatever) hours of sleep in your day can make a big difference in how you feel, and likely in to what extent and how quickly you adjust to the new sleep schedule.</p>
<p>As far as I'm aware, the experiments to determine whether what I've been saying in this post is actually true or not haven't been done, and I'm not the guy to design or perform them (I'm a computational physicist, not a physiological psychologist). If I had to guess, though, and I didn't have any more information than was originally given, I would say that the questioner's problem in adapting to a 6 hour monophasic schedule stemmed from some combination of trying to take the 6 hours of sleep at the wrong time of day, and not using an appropriate adaptation schedule to get from 8 hours to 6.</p>
<h4>References</h4>
<div class="papercite_entry">[1]                   J. Duffy and K. W. Jr, "Entrainment of the human circadian system by light," <span style="font-style: italic">Journal of Biological Rhythms</span>, vol. 20, iss. 4, pp. 326, 2005. <br/>    <a href="javascript:void(0)" id="papercite_1" class="papercite_toggle">[Bibtex]</a></div>
<pre class="papercite_bibtex" id="papercite_1_block"><code>@article{Duffy:2005p184,
author = {JF Duffy and KP Wright Jr},
journal = {Journal of Biological Rhythms},
title = {Entrainment of the human circadian system by light},
number = {4},
pages = {326},
volume = {20},
year = {2005},
date-added = {2010-08-27 18:58:04 -0400},
date-modified = {2010-08-27 19:07:13 -0400},
pmid = {1820236591755455448related:2GvAQonHQhkJ},
local-url = {file://localhost/Users/cmckay/Documents/Papers/2005/Duffy/Journal%20of%20Biological%20Rhythms%202005%20Duffy.pdf},
uri = {papers://FA38B055-1173-4624-8A1D-F12C6F8B1027/Paper/p184},
rating = {0}
}</code></pre>
]]></content:encoded>
			<wfw:commentRss>http://www.stumblingfool.org/blog/?feed=rss2&#038;p=172</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Tool for viewing sleep data</title>
		<link>http://www.stumblingfool.org/blog/?p=146</link>
		<comments>http://www.stumblingfool.org/blog/?p=146#comments</comments>
		<pubDate>Sat, 20 Nov 2010 14:27:23 +0000</pubDate>
		<dc:creator>CQM</dc:creator>
				<category><![CDATA[Uncategorized]]></category>

		<guid isPermaLink="false">http://www.stumblingfool.org/blog/?p=146</guid>
		<description><![CDATA[The fine folks at Zeo invited me to participate in a polyphasic sleep experiment, and of course I was thrilled to do it. I've put together a little tool in python for parsing the output from their hardware. First, the pretty pictures: I'm nearly satisfied with the results (though I don't understand where the doubling [...]]]></description>
			<content:encoded><![CDATA[<p>The fine folks at <a href=http://myzeo.com>Zeo</a> invited me to participate in a polyphasic sleep experiment, and of course I was thrilled to do it.  I've put together a little tool in python for parsing the output from their hardware.  First, the pretty pictures:<br />
<div id="attachment_147" class="wp-caption alignnone" style="width: 310px"><a href="http://www.stumblingfool.org/blog/wp-content/uploads/2010/11/histogram.png"><img src="http://www.stumblingfool.org/blog/wp-content/uploads/2010/11/histogram-300x225.png" alt="" title="sleep stage histogram" width="300" height="225" class="size-medium wp-image-147" /></a><p class="wp-caption-text">Histogram for 2 weeks of monophasic sleep data</p></div><br />
<div id="attachment_148" class="wp-caption alignnone" style="width: 310px"><a href="http://www.stumblingfool.org/blog/wp-content/uploads/2010/11/offset-hist.png"><img src="http://www.stumblingfool.org/blog/wp-content/uploads/2010/11/offset-hist-300x225.png" alt="" title="offset-hist" width="300" height="225" class="size-medium wp-image-148" /></a><p class="wp-caption-text">Histogram of times relative to sleep onset.</p></div><br />
<div id="attachment_150" class="wp-caption alignnone" style="width: 310px"><a href="http://www.stumblingfool.org/blog/wp-content/uploads/2010/11/hypnogram.png"><img src="http://www.stumblingfool.org/blog/wp-content/uploads/2010/11/hypnogram-300x225.png" alt="" title="hypnogram" width="300" height="225" class="size-medium wp-image-150" /></a><p class="wp-caption-text">Sleep stage hypnogram</p></div><br />
<div id="attachment_151" class="wp-caption alignnone" style="width: 310px"><a href="http://www.stumblingfool.org/blog/wp-content/uploads/2010/11/offset-hypno.png"><img src="http://www.stumblingfool.org/blog/wp-content/uploads/2010/11/offset-hypno-300x225.png" alt="" title="offset-hypno" width="300" height="225" class="size-medium wp-image-151" /></a><p class="wp-caption-text">Hypnogram with times relative to the start of data collection</p></div></p>
<p>I'm nearly satisfied with the results (though I don't understand where the doubling up on the legend in the first picture came from).  There are a couple of things which could probably still be tweaked.  One example: I'd like the bars on the hypnograms to be centered on the date labels which correspond to them.</p>
<p>Zeo has made a library available to directly decode the data stored on the SD card in their device, but it's written in Java, and I don't do Java.  So, I used their website to convert the data to CSV, and then parsed the CSV file with my script.  It is no doubt much less efficient than it could be; I just wanted something which I could put together pretty quickly and would give me decent output.  Also, I've only been using python for about 3 weeks, so I'm sure the coding style is not great.</p>
<p>Code can be downloaded <a href="http://www.stumblingfool.org/blog/wp-content/downloads/sleepplots.txt">here</a>.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.stumblingfool.org/blog/?feed=rss2&#038;p=146</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>In which the optimizer eats itself, part I</title>
		<link>http://www.stumblingfool.org/blog/?p=110</link>
		<comments>http://www.stumblingfool.org/blog/?p=110#comments</comments>
		<pubDate>Mon, 15 Nov 2010 16:35:03 +0000</pubDate>
		<dc:creator>CQM</dc:creator>
				<category><![CDATA[Optimization]]></category>

		<guid isPermaLink="false">http://www.stumblingfool.org/blog/?p=110</guid>
		<description><![CDATA[A vanilla PSO algorithm has a handful of tunable parameters which affect the convergence rate. For our purposes, we're going to consider the set \{ \omega, c_1, c_2, n_{\mathrm{particles}} \} to be our set of adjustable parameters. It is certainly true that the bounds on both position and velocity for each particle will affect convergence, [...]]]></description>
			<content:encoded><![CDATA[<p>A vanilla PSO algorithm has a handful of tunable parameters which affect the convergence rate.  For our purposes, we're going to consider the set<br />
<p style='text-align:center;'><span class='MathJax_Preview'><img src='http://www.stumblingfool.org/blog/wp-content/plugins/latex/cache/tex_73e5f3aa3555e4dae47c687af1f563ff.gif' style='' class='tex' alt=" \{ \omega, c_1, c_2, n_{\mathrm{particles}} \}" /></span><script type='math/tex' mode='display'> \{ \omega, c_1, c_2, n_{\mathrm{particles}} \}</script></p></p>
<p>to be our set of adjustable parameters.  It is certainly true that the bounds on both position and velocity for each particle will affect convergence, but including those would in effect be saying "we converge quickly if we guess well," and is therefore considered cheating.  It is possible that we could isolate a relationship between the size of the position space and the optimum size of the velocity space, but we won't investigate that just yet.</p>
<p>As a reminder, the velocity update formula for vanilla PSO is<br />
<p style='text-align:center;'><span class='MathJax_Preview'><img src='http://www.stumblingfool.org/blog/wp-content/plugins/latex/cache/tex_8e4459196c212b2ab1b40dbc6d3e919d.gif' style='' class='tex' alt=" v_i^{\mathrm{new}} = \omega v_i^{\mathrm{old}} + c_1 r_1 (x_i^{p} - x_i) + c_2 r_2 (x_i^{g}-x_i), " /></span><script type='math/tex' mode='display'> v_i^{\mathrm{new}} = \omega v_i^{\mathrm{old}} + c_1 r_1 (x_i^{p} - x_i) + c_2 r_2 (x_i^{g}-x_i), </script></p></p>
<p>where the subscript i denotes the dimension/direction, the superscript p denotes the personal best location, and the superscript g denotes the global best location.  The two values r1 and r2 are random numbers between 0 and 1.  Thus omega determines the particle's tendency to maintain the same direction of travel in parameter space; c1 gives the influence of memory (the personal best), and c2 gives the influence of the rest of the population.  Essentially, we want to know how to balance the three tendencies to get the fastest convergence to the best value.  Additionally, we're interested in knowing the optimum population size.</p>
<p>In the literature, there are often references to "reasonable" sets of parameters. Presumably, these are parameters which work well for a wide variety of problems.  Preliminary work however has shown that a poor choice of parameters can make a staggering difference in convergence rate:  with even a very simple objective function, a good choice of parameters leads to convergence in O(10) iterations, while a poor choice may require O(10^5), if it manages to converge at all.</p>
<p>Since the spread in performance is so large, we are interested in finding not just a reasonable set of parameters, but an optimum one.  Clearly, the optimum parameters will be problem dependent, and we expect that both the form and the dimensionality of the objective function will affect the location of the optimum in parameter space, but it is possible to both quantify the "goodness" of a particular parameter choice over a wide range of sample problems, as well as find a set of parameters which will give, on average, the best performance over the classes of problems a user expects to encounter.  This is our goal.</p>
<h4>Previous approaches</h4>
<p>Obviously, we're not the first people to try to figure out how an optimization algorithm's performance depends on its parameters; in fact, this is one of the first things the creators of any new optimization algorithm are likely to consider.  For example, [1] examines the effects of varying the inertia weight on the convergence of a PSO.  Several papers since then have looked at parameter selection (examples include [2], [3], and [4]).  It would be useful here to include summaries of these papers and their conclusions, but I'm leaving that as an exercise for my student.</p>
<p>One key point is that there are multiple parameters which can and do affect performance of an optimization algorithm, and they are not necessarily independent of one another.  Studying the sensitivity of PSO to one of its parameters, such as inertia weight, is necessary but not sufficient to really understand the full convergence behavior.  This is why we're taking a slightly different approach.</p>
<h4>Experimental design</h4>
<p>Remember our objective: we want to come up with the optimal set of parameters to use in a PSO, for some reasonable definition of optimal.  This caveat is necessary due to the "no free lunch" theorem&#8212;if an algorithm works particularly well for one problem, it has to have compensating costs with other problems.  In our case, the "problem" consists of an objective function and a dimensionality, and we're interested in how both of these affect the selection of optimum parameters.  If only we had some way to find an optimum set of inputs for a desired output...</p>
<p>The logical thing to do, of course, is to apply our optimization scheme to itself.  If we define an appropriate "cost function" as our objective, and use the parameters of the optimizer as the locations of our particles in the search space, we can then simply minimize the cost function to find the optimum parameters.  For this scheme to work, we need to think carefully about the cost function we will use.</p>
<p>We are trying to account for two things with our cost function.  First, we want to ensure convergence (to a good value).  Second, we want the convergence to happen in as few function evaluations as possible. We can test whether or not the algorithm converged easily enough, and on a test problem to which we know the answer, we can even look at the quality of the converged value.  It is likewise simple enough to count the number of function evaluations for each run of the algorithm.  What is not as simple is figuring out what weights to give each of these factors.  The number of function evaluations in a convergent optimization is likely to be somewhere between 10 and 100,000; the actual value (or its difference from the true value) is 10^-8 or smaller, depending on how we set the tolerances.  To ensure that the two effects are roughly of equal strength, we take the log of both numbers and add them together.  This may not be exactly the right choice, but it is probably headed in the right direction.</p>
<p>Furthermore, since we're choosing the initial distribution of particles randomly, there could be significant variation in convergence rate due to the initial distribution.  To mitigate this, we will run each optimization a number of times, and average the results.</p>
<h4>Implementation</h4>
<p>Here is our cost function, implemented in python (code tag seems to get rid of indentation for some reason):<br />
<code><br />
def optimizationcost(x):<br />
    oc1 = x[0]<br />
    oc2 = x[1]<br />
    oo = x[2]<br />
    onp = x[3]<br />
    costfunc = 0<br />
    for i in xrange(meta_avg_iter):<br />
        opt = ParticleSwarmOptimizer(ndim=metandim,c1=oc1,c2=oc2,omega=oo,nparticles=onp,checkpointing=False,idstring="inner")<br />
	opt.checkpointing = False # don't checkpoint the small ones<br />
	opt.outputs = False # don't print output<br />
        opt.optimize()<br />
        costfunc = costfunc + np.log(opt.bestval) + np.log(opt.func_count)<br />
    costfunc = costfunc / meta_avg_iter<br />
    return costfunc<br />
</code><br />
Running through piece by piece, we unpack the "position" vector into the parameters which govern our particle swarm optimizer: c1, c2, omega, and the number of particles.  Initialize the cost function to zero, and then loop over the number of iterations we've chosen to give a reasonable average value.  For each of these iterations, create a new particle swarm with the parameters passed in as the "position", and optimize it; take the log of the ending value and the function count, and add them to the cost function.  When we're done, divide the cost function by the number of iterations to get the average, and return it.</p>
<p>This doesn't specifically address the question of whether or not the PSO converged.  If it didn't, the function count will be a maximum, and presumably the final value won't be very good, both of which will increase the cost function.  It isn't clear yet if this is enough of a penalty for non-convergence, or if we need to add an additional cost for those cases which don't make it to the convergence criteria.</p>
<p>This is called by the following code:<code><br />
def metaoptimize():<br />
    meta = ParticleSwarmOptimizer(ndim=4,nparticles=20,xmin=[0.5, 0.5, 0.5, 2],xmax=[2.0, 2.0, 2.0, 10],idstring="metaoptimization")<br />
    meta.objfunc = optimizationcost<br />
    meta.checkpoint_interval = 1 # this is costly, so checkpoint frequently<br />
    meta.optimize()<br />
    print '-'*40<br />
    print 'optimum values:'<br />
    print 'c1, c2, omega, nparticles'<br />
    print meta.xghat<br />
    print 'best cost function:'<br />
    print meta.bestval</code></p>
<h4>Preliminary results</h4>
<p>Running this meta-optimization for a two dimensional DeJong spherical function produced the output:<code><br />
----------------------------------------<br />
optimum values:<br />
c1, c2, omega, nparticles<br />
[0.60421796126309335, 0.7165202768534048, 0.57727383847416636, 5.9018615370040104]<br />
best cost function:<br />
-16.1689363045</code>  However, since the meta-optimization did not, itself converge, I'm not sure how much to trust this. Also, running a separate simulation afterward with these parameters did not seem to lead to a very well converged solution, so there may be something else which is wrong here.  We'll need to adjust tolerances and number of iterations and run again to be more confident in these results.</p>
<h4>Where do we go from here?</h4>
<p>Once we are completely confident that our values for the 2D DeJong Spherical function make sense, we will move on to other objective functions and other dimensionalities.  This process takes long enough, however, that I'm working on parallelizing the code before taking this step; it seems like the speed gain from running in parallel will more than offset the time it will take to parallelize the code.</p>
<h4>References:</h4>
<div class="papercite_entry">[1]                   Shi and Eberhart, "Parameter selection in particle swarm optimization," <span style="font-style: italic">Evolutionary Programming VII</span>, pp. 591-600, 1998. <br/>    <a href="javascript:void(0)" id="papercite_6" class="papercite_toggle">[Bibtex]</a></div>
<pre class="papercite_bibtex" id="papercite_6_block"><code>@article{Shi:1998p476,
author = {Y Shi and R Eberhart},
journal = {Evolutionary Programming VII},
title = {Parameter selection in particle swarm optimization},
pages = {591--600},
year = {1998},
date-added = {2010-11-11 11:32:16 -0500},
date-modified = {2010-11-11 11:33:48 -0500},
pmid = {2605486712685716308related:VOOhWFyMKCQJ},
local-url = {file://localhost/Users/cmckay/Documents/Papers/1998/Shi/Evolutionary%20Programming%20VII%201998%20Shi.pdf},
uri = {papers://FA38B055-1173-4624-8A1D-F12C6F8B1027/Paper/p476},
rating = {0}
}</code></pre>
<div class="papercite_entry">[2]                   Zhang, Li, and Zhaołdots, "Guidelines for Parameter Selection in Particle Swarm Optimization According to Control Theory," <span style="font-style: italic">Natural Computation</span>, 2009. <br/>    <a href="javascript:void(0)" id="papercite_7" class="papercite_toggle">[Bibtex]</a></div>
<pre class="papercite_bibtex" id="papercite_7_block"><code>@article{Zhang:2009p485,
author = {W Zhang and H Li and Q Zhao{\ldots}},
journal = {Natural Computation},
title = {Guidelines for Parameter Selection in Particle Swarm Optimization According to Control Theory},
abstract = {Taking account of the inherent property of standard particle swarm optimization algorithm, the dynamic characteristics of position expectations on z-plane are studied and the effects of real and complex eigenvalues on the dynamic characteristic are discussed based on control },
year = {2009},
month = {Jan},
date-added = {2010-11-11 11:35:05 -0500},
date-modified = {2010-11-11 11:35:05 -0500},
pmid = {related:7MKmsSx4I8cJ},
URL = {http://ieeexplore.ieee.org/xpls/abs_all.jsp?arnumber=5365850},
uri = {papers://FA38B055-1173-4624-8A1D-F12C6F8B1027/Paper/p485},
rating = {0}
}</code></pre>
<div class="papercite_entry">[3]                   Zhang, Li, and Hełdots, "Parameter Selection in Particle Swarm Optimization Based on z-Plane Analysis," <span style="font-style: italic">Jisuanji Gongcheng/ Computer łdots</span>, 2010. <br/>    <a href="javascript:void(0)" id="papercite_8" class="papercite_toggle">[Bibtex]</a></div>
<pre class="papercite_bibtex" id="papercite_8_block"><code>@article{Zhang:2010p483,
author = {W Zhang and H Li and H He{\ldots}},
journal = {Jisuanji Gongcheng/ Computer {\ldots}},
title = {Parameter Selection in Particle Swarm Optimization Based on z-Plane Analysis},
abstract = {The dynamic characteristics of position expectations on z-plane are studied considering the insufficient quantitative estimates and theoretical basis on Particle Swarm Optimization(PSO) parameter selection. The effect of module values and phase angles of the complex characteristic },
year = {2010},
month = {Jan},
date-added = {2010-11-11 11:35:05 -0500},
date-modified = {2010-11-11 11:35:05 -0500},
URL = {http://www.csa.com/partners/viewrecord.php?requester=gs&#038;collection=TRD&#038;recid=201006500371707CI},
uri = {papers://FA38B055-1173-4624-8A1D-F12C6F8B1027/Paper/p483},
rating = {0}
}</code></pre>
<div class="papercite_entry">[4]           <a href='http://dx.doi.org/10.1007/s10898-009-9493-0' class='papercite_doi' title='View document in publisher site'><img src='http://www.stumblingfool.org/blog/wp-content/plugins/papercite/img/external.png' width='10' height='10' alt='[doi]' /></a>        E. Campana, G. Fasano, and A. Pinto, "Dynamic analysis for the selection of parameters and initial population, in particle swarm optimization," <span style="font-style: italic">J Glob Optim</span>, vol. 48, iss. 3, pp. 347-397, 2010. <br/>    <a href="javascript:void(0)" id="papercite_9" class="papercite_toggle">[Bibtex]</a></div>
<pre class="papercite_bibtex" id="papercite_9_block"><code>@article{Campana:2010p465,
author = {Emilio F Campana and Giovanni Fasano and Antonio Pinto},
journal = {J Glob Optim},
title = {Dynamic analysis for the selection of parameters and initial population, in particle swarm optimization},
abstract = {},
number = {3},
pages = {347--397},
volume = {48},
year = {2010},
month = {Nov},
keywords = {},
date-added = {2010-11-11 11:32:21 -0500},
date-modified = {2010-11-11 11:32:23 -0500},
doi = {10.1007/s10898-009-9493-0},
pii = {9493},
local-url = {file://localhost/Users/cmckay/Documents/Papers/2010/Campana/J%20Glob%20Optim%202010%20Campana.pdf},
uri = {papers://FA38B055-1173-4624-8A1D-F12C6F8B1027/Paper/p465},
rating = {0}
}</code></pre>
]]></content:encoded>
			<wfw:commentRss>http://www.stumblingfool.org/blog/?feed=rss2&#038;p=110</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Data plotting, three (or four) ways, part II</title>
		<link>http://www.stumblingfool.org/blog/?p=96</link>
		<comments>http://www.stumblingfool.org/blog/?p=96#comments</comments>
		<pubDate>Fri, 29 Oct 2010 03:05:00 +0000</pubDate>
		<dc:creator>CQM</dc:creator>
				<category><![CDATA[Uncategorized]]></category>

		<guid isPermaLink="false">http://www.stumblingfool.org/blog/?p=96</guid>
		<description><![CDATA[Having briefly covered gnuplot for interactive plotting, I'm now going to cover Octave. First, we might ask why anyone would need more than gnuplot. It does, after all, seem relatively powerful, and with a bit of coaxing it can produce reasonably good looking output. Consider, however, the case where you want to do some kind [...]]]></description>
			<content:encoded><![CDATA[<p>Having briefly covered gnuplot for interactive plotting, I'm now going to cover Octave.  First, we might ask why anyone would need more than gnuplot.  It does, after all, seem relatively powerful, and with a bit of coaxing it can produce reasonably good looking output.  Consider, however, the case where you want to do some kind of manipulation to your data before plotting it.  Perhaps you want to integrate, or differentiate.  Perhaps you want to do something more complicated, like look at a Fourier transform, or a deconvolution.  These are things that cannot be done in gnuplot.</p>
<p>So, enter Octave.  Octave is a freely available, cross platform environment for doing matrix calculations.  It is essentially a drop in replacement for Matlab.  Matlab does have a broader set of capabilities, but chances are very good that if you're trying to do something that Matlab can do and Octave can't, you don't need to read this post.  One of the upsides to this is that the many, many resources available to Matlab users can be used essentially without alteration in learning to use Octave.  In addition, there are some Octave specific guides and tutorials available on the web (teh google is your friend).</p>
<p>Octave uses gnuplot as a backend for its plotting functions, so the limitations I discussed in my previous post (as well as a few others) apply here, as well.</p>
<h4>An example</h4>
<p>My first example for this post is data taken as part of one of the capstone projects I'm advising.  My student has a file with several thousand samples of three components of an acceleration vector, along with the times at which those samples were collected.  The data is in a CSV (comma separated values) file.  A few sample lines:<br />
<code><br />
0.179869, 0.054337, 0.072449, -0.996170<br />
0.196659, 0.072449, 0.054337, -1.032394<br />
0.213507, 0.090561, 0.000000, -1.032394<br />
0.230280, 0.090561, 0.000000, -1.032394<br />
0.247088, 0.018112, 0.054337, -1.032394<br />
0.263834, 0.126785, 0.072449, -1.104843<br />
0.280726, 0.072449, 0.036224, -1.032394<br />
0.297697, 0.054337, 0.054337, -1.086731<br />
0.314148, 0.054337, 0.072449, -1.050507<br />
</code></p>
<p>The first column is time, the second is the x-component of acceleration, the third is the y-component, and the z-component comes last.  Gnuplot could plot this file in place as it is, but we're going to want to read the file into memory before plotting it in Octave.  We read the file using the <code>dlmread</code> (delimited read) command:<br />
<code><br />
octave-3.2.3:3> data = dlmread("tenmin.csv",",");<br />
octave-3.2.3:4> size(data)<br />
ans =<br />
   35169       4<br />
</code></p>
<p>The first argument is the filename, the second is the delimiter (in this case, a comma).  Other arguments are possible for more involved reading, but this data set is small enough that we don't need to worry about that.  We see that after reading the file, we have an array called data which has 35169 rows and 4 columns.  Next we want to separate out the various components:<br />
<code><br />
octave-3.2.3:6> t = data(:,1);<br />
octave-3.2.3:7> ax = data(:,2);<br />
octave-3.2.3:8> ay = data(:,3);<br />
octave-3.2.3:9> az = data(:,4);<br />
</code></p>
<p>In these statements we are assigning all of the rows of each column of the data array to a new variable.  Now we can plot any or all of these against t, as appropriate.  The commands<br />
<code><br />
octave-3.2.3:15> plot(t,ax,";x component;",t,ay,";y component;",t,az,";z component;")<br />
octave-3.2.3:16> xlabel("Time (s)");<br />
yoctave-3.2.3:17> ylabel("Acceleration (g)");<br />
octave-3.2.3:18> title("Acceleration data");<br />
octave-3.2.3:20> print("acc10.pdf","-dpdf");<br />
</code></p>
<p>produce the image<br />
<div id="attachment_101" class="wp-caption alignnone" style="width: 310px"><a href="http://www.stumblingfool.org/blog/wp-content/uploads/2010/10/acc10.png"><img src="http://www.stumblingfool.org/blog/wp-content/uploads/2010/10/acc10-300x227.png" alt="Full set of acceleration data" title="acc10" width="300" height="227" class="size-medium wp-image-101" /></a><p class="wp-caption-text">10 minutes of acceleration data</p></div></p>
<p>A few syntax notes:</p>
<ol>
<li>You must supply vectors of data for both x and y for each curve you want to plot.</li>
<li>You can provide a title (for the key) for a curve by including ";title;" after the x and y data are specified.  Other flags can also go here to set line color and style, as well as a few other options.</li>
<li>The annotations for the axes and the plot title come after the plot command (In gnuplot, they persisted from one plot to the next, so order didn't matter.  In octave, that's not the case under most circumstances.)</li>
<li>The operations of creating a plot and creating a file to put in a presentation, paper, or web page are distinct from one another.  We used <code>plot</code> to make the picture, and <code>print</code> to create the file.  Again, this is different than the behavior in gnuplot.</li>
<li>The plot range is not directly part of the <code>plot</code> command.  Rather, we use <code>axis</code> to set plot ranges.  Just like annotations, it is used after the initial call to <code>plot</code>.</li>
</ol>
<p>This plot was saved as pdf and then converted to png by hand (using preview on my mac).  One could also print directly to a png.  The plot is pretty close to worthless, since it's impossible to distinguish any characteristics of the data at this scale.  We'd really rather look at a short section, perhaps 10 seconds or so, and see what we can see from that.  As mentioned above, we use the <code>axis</code> command to accomplish this, so:<br />
<code><br />
octave-3.2.3:24> axis([200,210])<br />
octave-3.2.3:27> print("acc10sec.png","-dpng");<br />
</code><br />
<a href="http://www.stumblingfool.org/blog/wp-content/uploads/2010/10/acc10sec.png"><img src="http://www.stumblingfool.org/blog/wp-content/uploads/2010/10/acc10sec-300x225.png" alt="" title="acc10sec" width="300" height="225" class="alignnone size-medium wp-image-102" /></a></p>
<p>This version went straight to png, rather than being printed to pdf and then converted after the fact.  Note in particular that the fonts are different (and less readable).</p>
<h4>Now for something a little more interesting</h4>
<p>The astute reader will at this point be wondering why we're using octave, since everything we've done so far is equally possible in gnuplot.  To answer that question, I'm going to show just one more plot.  The data we've been looking at is acceleration data, with time measured in seconds, and acceleration measured in g's. I'd like to make the following changes to what we've seen so far:</p>
<ol>
<li>Change the time unit from seconds to minutes.</li>
<li>Remove the mean acceleration to account for an imperfect orientation of my accelerometer.  If the initial and final velocities of the device were the same, the mean acceleration over the whole interval should be zero.</li>
<li>Integrate to show velocity rather than acceleration</li>
</ol>
<p>We do all of this with the sequence of commands:<br />
<code><br />
octave-3.2.3:59> tminutes = t/60;<br />
octave-3.2.3:61> rmksvx = 1./50.0*cumtrapz(9.8*(ax-mean(ax)));<br />
octave-3.2.3:61> rmksvy = 1./50.0*cumtrapz(9.8*(ay-mean(ay)));<br />
octave-3.2.3:62> rmksvz = 1./50.0*cumtrapz(9.8*(az-mean(az)));<br />
octave-3.2.3:65> plot(tminutes,rmksvx,";vx;",tminutes,rmksvy,";vy;",tminutes,rmksvz,";vz;")<br />
octave-3.2.3:66> grid<br />
octave-3.2.3:67> xlabel("Time (min)")<br />
octave-3.2.3:68> ylabel("Velocity (m/s)");<br />
octave-3.2.3:70> title("Corrected velocity data");<br />
octave-3.2.3:71> print("corvall.pdf","-dpdf");<br />
octave-3.2.3:72> axis([3.5,4])<br />
octave-3.2.3:73> print("corvzoom.pdf","-dpdf");<br />
</code></p>
<p>The following two plots are the result:<br />
<a href="http://www.stumblingfool.org/blog/wp-content/uploads/2010/10/corvall.png"><img src="http://www.stumblingfool.org/blog/wp-content/uploads/2010/10/corvall-300x240.png" alt="" title="corvall" width="300" height="240" class="size-medium wp-image-104" /></a><br />
<a href="http://www.stumblingfool.org/blog/wp-content/uploads/2010/10/corvzoom.png"><img src="http://www.stumblingfool.org/blog/wp-content/uploads/2010/10/corvzoom-300x248.png" alt="" title="corvzoom" width="300" height="248" class="size-medium wp-image-105" /></a></p>
<p>Now, clearly this needs some work.  The data was taken in a crew boat, with the y-axis parallel to the long axis of the boat.  We don't expect that the crosswise velocity will ever be greater than the parallel velocity, so our simpleminded approach of subtracting the average acceleration isn't good enough.  Perhaps the next step will be to see if we can extract the features of interest using some kind of decomposition (Fourier? Chebyshev? Wavelets? EOFs?) and appropriate filters. That isn't really the point here, though.  The point is that we can do some sophisticated manipulations of our data with very little effort in octave.  </p>
<h4>Parting Shots</h4>
<p>Octave accomplishes something significant, I think.  It manages to occupy the space in between "code to generate results", and "external package only for plotting". In fact, it can be used to craft whole simulations (and people use both Octave and Matlab for that purpose), or it could be used just to generate plots.  My personal feeling, though, is that if you're not using it to generate your data, octave is most useful when you have some nontrivial manipulation of your data to perform before you generate the plots.  In contrast to a more traditional programming language, Octave's focus is on mathematical transformations of matrices and vectors; this means that there are a lot of operations which would take some care, significant thought, and/or access to a third-party library to pull off in python or fortran which are trivial (or nearly so) in Octave.</p>
<p>Symbolic math tools, such as Maxima and Mathematica, fall somewhat into the same category, but Octave's focus on arrays and matrices rather than functions is important. It is definitely an oversimplification to say that one should use Octave for discrete tasks and Maxima for continuous ones, but it's a decent rule of thumb.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.stumblingfool.org/blog/?feed=rss2&#038;p=96</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Data plotting, three (or four) ways, part I</title>
		<link>http://www.stumblingfool.org/blog/?p=85</link>
		<comments>http://www.stumblingfool.org/blog/?p=85#comments</comments>
		<pubDate>Tue, 26 Oct 2010 15:40:23 +0000</pubDate>
		<dc:creator>CQM</dc:creator>
				<category><![CDATA[Visualization]]></category>

		<guid isPermaLink="false">http://www.stumblingfool.org/blog/?p=85</guid>
		<description><![CDATA[There are a large number of tools one can use to visualize data. I'm comfortable using a few of them, and there are a few more I could use if pressed. I generally separate visualization tools into three categories, based on how close they are to the source of the data we're trying to plot: [...]]]></description>
			<content:encoded><![CDATA[<p>There are a large number of tools one can use to visualize data.  I'm comfortable using a few of them, and there are a few more I could use if pressed.  I generally separate visualization tools into three categories, based on how close they are to the source of the data we're trying to plot:</p>
<ol>
<li>Interactive plotting tools which display data that has been previously generated,
<li>Scriptable tools which can do some operations on the data, and
<li>Visualization built into the data source.
</ol>
<p>There's obviously some overlap between these areas.  Nearly all scriptable tools are also somewhat interactive; there are ways to store data so that even  simple interactive tools can do some things in the way of operating on the data, and so on.  I'm going to give some details about implementing plots in three tools (with examples): gnuplot, octave, and python.</p>
<h4>Quick and simple: gnuplot</h4>
<p>Gnuplot is a freely available, cross platform interactive plotting environment.  There are several tutorials available on the web; I think <a href=http://www.duke.edu/~hpgavin/gnuplot.html>this one</a> is pretty comprehensive.  The authors have prepared a <a href=www.gnuplot.info/docs_4.0/gpcard.pdf>reference card (pdf)</a>, and the full <a href=http://www.gnuplot.info/documentation.html>documentation</a> is also available.  Finally, if you just want a quick something to jog your memory, the interactive help system works reasonably well.  Just type help at the prompt, and it will guide you the rest of the way through.</p>
<p>I'm not going to try to produce a gnuplot guide more complete or more useful than what already exists, but I think it is worthwhile to give a couple of examples, caveats, and use cases. Consider, for example, figure 10 from my previous post:<br />
<div id="attachment_70" class="wp-caption alignnone" style="width: 650px"><a href="http://www.stumblingfool.org/blog/wp-content/uploads/2010/10/out-16.png"><img src="http://www.stumblingfool.org/blog/wp-content/uploads/2010/10/out-16.png" alt="Time series data" title="out-16" width="640" height="480" class="size-full wp-image-70" /></a><p class="wp-caption-text">Figure 10 from my previous post</p></div></p>
<p>This plot was generated with the set of commands:<br />
<code><br />
gnuplot> set title 'Two days of process S, process C, and sleep propensity'<br />
gnuplot> set ylabel 'sleep propensity (arb. units)'<br />
gnuplot> set xlabel 'time (hours)'<br />
gnuplot> set grid<br />
gnuplot> set term png<br />
gnuplot> set output 'out-16-zoomed.png'<br />
gnuplot> plot [168:216]'out-16' with lines title 'process S', 'out-16' using 1:3 with lines title 'process C', 'out-16' using 1:4 with lines title 'sleep propensity'<br />
</code><br />
(gnuplot gives some output which I have removed for clarity).  It's fine to use png for embedding in a web page, but if the plot is going to be resized or included in a document that will be printed, raster formats like png (or, even worse, something lossy like jpeg) are sub-optimal.  The best options for vector formats are either postscript or pdf.  Personally, I prefer pdf most of the time, but the postscript driver allows you a wider selection of typefaces.</p>
<p>For good or ill, gnuplot's output drivers give slightly different results for different devices.  For example, here is the pdf version of the above plot (converted to png for web display):<br />
<div id="attachment_90" class="wp-caption alignnone" style="width: 760px"><a href="http://www.stumblingfool.org/blog/wp-content/uploads/2010/10/out-16-zoomed-from-pdf.png"><img src="http://www.stumblingfool.org/blog/wp-content/uploads/2010/10/out-16-zoomed-from-pdf.png" alt="" title="out-16-zoomed-from-pdf" width="750" height="451" class="size-full wp-image-90" /></a><p class="wp-caption-text">The same image as above, but output originally as a pdf</p></div></p>
<p>Both the size and the aspect ratio are different from the png version, as are the fonts and some line attributes. Helvetica is the default font, which is fine for displaying on a screen, but I prefer a serifed font in print.  Unfortunately, font selection is limited to the 'pdf core fonts': Times, Helvetica, Symbol, and Zapf Dingbats (along with a couple of variations in weight and angle).  Also, if the plot is going to be scaled down at all, the font size should probably be increased to improve readability.  I would make some additional changes, as well, but those really fall under the heading of data graphic design, which is the subject of a future post.</p>
<p>One of the strengths of gnuplot is that it is interactive, so you can make small changes to your plot and see them updated quickly.  I always look at plots interactively before sending them to a file.  When you start gnuplot, interactive behavior is the default; the 'set terminal' and 'set output' commands in the example are what change the behavior.  Typically, I will get the plot looking exactly how I want it in a window on the screen, and then set the terminal and output, and then type 'replot' to simply reproduce what I see on my screen in the file.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.stumblingfool.org/blog/?feed=rss2&#038;p=85</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>How not to present data</title>
		<link>http://www.stumblingfool.org/blog/?p=36</link>
		<comments>http://www.stumblingfool.org/blog/?p=36#comments</comments>
		<pubDate>Sun, 24 Oct 2010 03:58:57 +0000</pubDate>
		<dc:creator>CQM</dc:creator>
				<category><![CDATA[Sleep]]></category>
		<category><![CDATA[Visualization]]></category>

		<guid isPermaLink="false">http://www.stumblingfool.org/blog/?p=36</guid>
		<description><![CDATA[One of my students presented some data in our research meeting today, and it did not go well. It was essentially my fault, so I thought I'd take a look at the situation after the fact and try to see how to rehabilitate the presentation. The problem First, the graphs as they were presented: Looking [...]]]></description>
			<content:encoded><![CDATA[<p>One of my students presented some data in our research meeting today, and it did not go well.  It was essentially my fault, so I thought I'd take a look at the situation after the fact and try to see how to rehabilitate the presentation.</p>
<h3>The problem</h3>
<p>First, the graphs as they were presented:<br />
<div id="attachment_40" class="wp-caption alignnone" style="width: 310px"><a href="http://www.stumblingfool.org/blog/wp-content/uploads/2010/10/times-10.png"><img src="http://www.stumblingfool.org/blog/wp-content/uploads/2010/10/times-10-300x180.png" alt="" title="times-10" width="300" height="180" class="size-medium wp-image-40" /></a><p class="wp-caption-text">Figure 1: 1-10</p></div></p>
<div id="attachment_38" class="wp-caption alignnone" style="width: 310px"><a href="http://www.stumblingfool.org/blog/wp-content/uploads/2010/10/times-20.png"><img src="http://www.stumblingfool.org/blog/wp-content/uploads/2010/10/times-20-300x180.png" alt="" title="times-20" width="300" height="180" class="size-medium wp-image-38" /></a><p class="wp-caption-text">Figure 2: 10-20</p></div>
<div id="attachment_42" class="wp-caption alignnone" style="width: 310px"><a href="http://www.stumblingfool.org/blog/wp-content/uploads/2010/10/times-30.png"><img src="http://www.stumblingfool.org/blog/wp-content/uploads/2010/10/times-30-300x180.png" alt="" title="times-30" width="300" height="180" class="size-medium wp-image-42" /></a><p class="wp-caption-text">Figure 3: 20-30</p></div>
<p>Looking at these plots, what do we see?  There are no meaningful titles or legends; the axes are not labeled, and the difference between vertical and horizontal scale on the second plot make it very hard to see what is going on.  Basically, there is no <i>story</i> there, or if there is, it's impossible for an uninformed viewer to tease it out.</p>
<p>The point of data in any presentation is to <i>support a narrative</i>.  That is, you're trying to tell a story to your audience, and the data (hopefully) makes your version of events the most credible one. To serve this purpose, the audience has to know both what you are trying to show them and why.  Furthermore, if the data is going to be a convincing part of your story, it must be presented in such a way that the audience can test out alternative explanations.  If there isn't enough information in your graphics for them to do that, or if the information you are showing them is organized in such a way that it is hard to reason about the narrative and how the data fits into it, your graphics are not serving their purpose.</p>
<h3>How this happened</h3>
<p>I said earlier that this was essentially my fault, and I stand by that.  The graphs my student used in his presentation were graphs which I had generated (or rather, which I had made his code generate), and when he asked me what he should present today, I said that the information in the graphs was where he ought to start.  I had generated the graphs quickly (using a horrible kludge, if the truth be known), in an effort to get a general feel for the data his simulation was generating, and without any thought of presenting it to anyone else.  These graphs were among a set we generated at the end of our one-on-one meeting, after we had looked at a number of other things, as well, and I was clear on both what I was looking at and what I wanted to look for.</p>
<p>He put these three graphs into a 6 (or so) slide powerpoint deck, and talked while showing us the slides.  As I haven't provided the text of his talk above, I have made the graphs somewhat more difficult to understand than they were at the time.  However, very little of the context we had when generating the graphs was present in his talk, and so it was understandable that some members of the audience were less than convinced.</p>
<h3>Let's try this again</h3>
<p>Rather than just posting new plots, let me tell you the <i>story</i>. </p>
<p>This project is an examination of the two-process model of sleep regulation.  Typically sleep researchers who do modeling studies are trying to tune their models so that they match actual sleep data.  We're approaching the problem from a more mathematical angle, focusing on the question of what solutions are permitted by the model, and for what parameter ranges those solutions apply.  In particular, we're interested in finding multiply periodic, aperiodic, or chaotic solutions, if they exist. A closely related question is what features or refinements we need to include in the model to obtain such solutions.</p>
<p>The two process model of sleep regulation posits that the tendency to fall asleep and to wake up is governed by the combined effect of two distinct processes.  The first process ('Process C') is a periodic process, with a period of about 24 hours.  Since the main component of this process is roughly day length, it is referred to as circadian, but to capture all of the effects of process C (most people feel sleepy in the afternoon, for example), it is necessary to include some higher frequency modes. The second process, ('Process S'), is homeostatic in nature.  It increases the person's need to sleep while they are awake, and decreases the need to sleep while the person is sleeping. A weighted sum of these two processes gives the overall sleep propensity, or likelihood that the person will fall asleep.  The actual scheme for transitioning between sleep and wakefulness is not specified by the model, but is necessary for implementation.  We'll come back to that point in a minute.</p>
<p>My student wrote a python script which evolves the two process model for about 60 days (we wanted the duration to be long enough to get rid of any initial transients).  The script outputs numerical values for process S, process C, and their sum (which we will call P) once per minute for the duration of the simulation.  Further, each time the person wakes up, the script records the time and the length of time the person was asleep. This record of sleep durations is a convenient way to determine some things about the dynamics.  In particular, we can see if the phase space orbit has a regular period.</p>
<p>Mathematically, Achermann defines process C as<br />
<p style='text-align:center;'><span class='MathJax_Preview'><img src='http://www.stumblingfool.org/blog/wp-content/plugins/latex/cache/tex_266922f82952b7c8447b51ca585ba042.gif' style='' class='tex' alt="C(k) = \sin(\phi + {2\pi k\over \tau_c}), " /></span><script type='math/tex' mode='display'>C(k) = \sin(\phi + {2\pi k\over \tau_c}), </script></p></p>
<p> and process S as </p>
<p><p style='text-align:center;'><span class='MathJax_Preview'><img src='http://www.stumblingfool.org/blog/wp-content/plugins/latex/cache/tex_4c56cecf42d14d7cf49b7510eaf0f165.gif' style='' class='tex' alt=" S(k) = e^{-T_s/\tau_s}S({k-1}) " /></span><script type='math/tex' mode='display'> S(k) = e^{-T_s/\tau_s}S({k-1}) </script></p></p>
<p> when sleeping, and </p>
<p><p style='text-align:center;'><span class='MathJax_Preview'><img src='http://www.stumblingfool.org/blog/wp-content/plugins/latex/cache/tex_4ededa4fc3b9edbefb9312ce7f641c0a.gif' style='' class='tex' alt=" S(k) = 1-e^{-T_s/\tau_w}(1-S({k-1}))" /></span><script type='math/tex' mode='display'> S(k) = 1-e^{-T_s/\tau_w}(1-S({k-1}))</script></p> </p>
<p>when awake.</p>
<p>As written here, the model has three timescales, a phase, and a relative amplitude (the sleep propensity function is a linear combination of processes C and S, but exactly how much of each is adjustable), for a total of five parameters.  Of these, only the phase can be consciously controlled; there is presumably some inter-individual variation in the timescales, and it may be possible for an individual to adjust his or her timescales.  As far as the relative importance of process C and process S, it isn't clear to me yet if that is something which can be changed or not.  So, as a first approximation, if we're looking for interesting model behavior, the parameter which we should consider varying first is <span class='MathJax_Preview'><img src='http://www.stumblingfool.org/blog/wp-content/plugins/latex/cache/tex_1ed346930917426bc46d41e22cc525ec.gif' style=' padding-bottom:1px;' class='tex' alt="\phi" /></span><script type='math/tex'>\phi</script>.</p>
<p>For our first look at the model, we chose the timescales</p>
<table>
<tr>
<td><span class='MathJax_Preview'><img src='http://www.stumblingfool.org/blog/wp-content/plugins/latex/cache/tex_24f068c5eda18182c4e987e30361dde5.gif' style=' padding-bottom:2px;' class='tex' alt="\tau_c" /></span><script type='math/tex'>\tau_c</script></td>
<td> 24 hours</td>
</tr>
<tr>
<td><span class='MathJax_Preview'><img src='http://www.stumblingfool.org/blog/wp-content/plugins/latex/cache/tex_c20b4b6a5284904ea5b10d4e5b818f5a.gif' style=' padding-bottom:2px;' class='tex' alt="\tau_w" /></span><script type='math/tex'>\tau_w</script></td>
<td> 4 hours</td>
</tr>
<tr>
<td><span class='MathJax_Preview'><img src='http://www.stumblingfool.org/blog/wp-content/plugins/latex/cache/tex_e0d09e8d52c2fa24dbd598b5a5cce80f.gif' style=' padding-bottom:2px;' class='tex' alt="\tau_s" /></span><script type='math/tex'>\tau_s</script></td>
<td> 2 hours</td>
</tr>
</table>
<p>based mostly on the idea that the average person is awake twice as long as they are asleep.  We weighted the two processes equally, and varied the phase between 0 and 2 pi in 100 steps.</p>
<p>The last piece of the model needed before we could run a simulation was to decide on the transition conditions between waking and sleeping.  There are a number of possible transition conditions, but they generally fall into two classes:  set waking and sleeping times, or sleep propensity function thresholds.  We decided to include both, and to additionally have the option of some randomness to account for individual variations across days.  Since most people seem to wake at a set time, but go to sleep when they are tired, we did not impose a time to fall asleep, but rather relied on an upper threshold on sleep propensity to make that transition.  Waking happens in the simulation when either the person hits a lower threshold on sleep propensity (i.e., the person is sufficiently rested to wake up) or a particular time is reached (i.e., the alarm clock goes off).  The plots I showed at the beginning of this post include some random variation in the threshold to fall asleep, which accounts for most of the variability present in those plots. (I'll come back to this in a moment).</p>
<p>At this point, we are ready to start exploring parameter space.  As discussed above, the first parameter we're going to vary is phi, and this leads us to our first use of data graphics.  We want to be able to see as much of the landscape of parameter space as possible, to quickly zero in on the area(s) where we see interesting behavior (assuming we find some such areas).  Although it isn't the best practice, it is not terribly uncommon to use poorly annotated plots for this kind of thing, because
<ol>
<li>you expect to only use them as a stepping stone to get to the really interesting stuff, and
<li>often you're doing them interactively, and adding titles and labels when you're doing a lot of interactive plots is a little on the tedious side (especially when you know what everything is; you're generating the plots interactively, after all).</ol>
<p>If you're having your code automatically generate plots, you might as well have it add all of the labels required to make the plots sensible to other people as well; it takes only a little bit of work to set it up once, and then all of your plots are much more accessible. </p>
<h3>Revised plots</h3>
<p>Running the code produces time series for process C, process S, and the sleep propensity function (which in this case is just the sum of the two).  A sample plot of the full 60 days of a simulation is below:<br />
<div id="attachment_59" class="wp-caption alignnone" style="width: 310px"><a href="http://www.stumblingfool.org/blog/wp-content/uploads/2010/10/out-56-all.png"><img src="http://www.stumblingfool.org/blog/wp-content/uploads/2010/10/out-56-all-300x225.png" alt="60 days of time series data" title="out-56-all" width="300" height="225" class="size-medium wp-image-59" /></a><p class="wp-caption-text">Figure 4: 60 days of sleep propensity data</p></div> </p>
<p>This plot is not very useful.  It's impossible to distinguish any difference between the shapes of S, C, and P, it's impossible to see the timescales, and it's impossible to tell if the apparent variations in extrema are just due to aliasing, or if they're real.  Zooming in, however, makes the picture clearer.<br />
<div id="attachment_61" class="wp-caption alignnone" style="width: 310px"><a href="http://www.stumblingfool.org/blog/wp-content/uploads/2010/10/out-56-week.png"><img src="http://www.stumblingfool.org/blog/wp-content/uploads/2010/10/out-56-week-300x225.png" alt="One week of time series data" title="out-56-week" width="300" height="225" class="size-medium wp-image-61" /></a><p class="wp-caption-text">Figure 5: A week's worth of sleep propensity data</p></div></p>
<p>This is better. The shapes are clear, the amplitude looks constant, and we can clearly distinguish the periods of sleep (where process S has a downward slope) from waking (where process S has an upward slope).  It still feels a little tight, though, so if I were going to show this to someone else, I would probably only do two days, rather than a whole week:<br />
<div id="attachment_64" class="wp-caption alignnone" style="width: 310px"><a href="http://www.stumblingfool.org/blog/wp-content/uploads/2010/10/out-56.png"><img src="http://www.stumblingfool.org/blog/wp-content/uploads/2010/10/out-56-300x225.png" alt="Two days of time series data" title="out-56" width="300" height="225" class="size-medium wp-image-64" /></a><p class="wp-caption-text">Figure 6: Two days of sleep propensity</p></div></p>
<p>While the two-day view is good for seeing what is going on over a day or two, what we really want to know is if things like wake time and sleep duration vary much (or at all) across days.  To answer that question, we generate a second kind of plot, which shows the duration of the most recent sleep episode.  We could also ask the related (but distinct) question of how much sleep the simulated test subject has had during each 24 hour period.  For now, let's just consider the most recent sleep episode.  For the example case we've been considering (phi = 2 pi * 56/100), the plot looks like this:<br />
<div id="attachment_57" class="wp-caption alignnone" style="width: 310px"><a href="http://www.stumblingfool.org/blog/wp-content/uploads/2010/10/times-56-points.png"><img src="http://www.stumblingfool.org/blog/wp-content/uploads/2010/10/times-56-points-300x225.png" alt="" title="times-56-points" width="300" height="225" class="alignnone size-medium wp-image-66" /></a><p class="wp-caption-text">Figure 7</p></div></p>
<p>This plot, too, can be improved.  We use points instead of lines because the data is discrete; we're talking about the durations of individual sleep episodes, and the durations are reported at the end of each episode. We don't have a problem looking at the whole 60 days of data, but the way the points vary on the plot makes it hard to easily see how big the variations in sleep length actually are.   We can provide some scale by fixing<br />
the limits of the y axis at 0 and 24 hours.</p>
<div id="attachment_100" class="wp-caption alignnone" style="width: 310px"><a href="http://www.stumblingfool.org/blog/wp-content/uploads/2010/10/times-56-fixedy.png"><img src="http://www.stumblingfool.org/blog/wp-content/uploads/2010/10/times-56-fixedy-300x225.png" alt="" title="times-56-fixedy" width="300" height="225" class="alignnone size-medium wp-image-67" /></a><p class="wp-caption-text">Figure 8</p></div>
<p>Although the data are discrete, we can still ask whether adding lines might be helpful.</p>
<div id="attachment_101" class="wp-caption alignnone" style="width: 310px"><a href="http://www.stumblingfool.org/blog/wp-content/uploads/2010/10/times-56-lp-fixedy.png"><img src="http://www.stumblingfool.org/blog/wp-content/uploads/2010/10/times-56-lp-fixedy-300x225.png" alt="" title="times-56-lp-fixedy" width="300" height="225" class="alignnone size-medium wp-image-68" /></a><p class="wp-caption-text">Figure 9</p></div>
<p>In this case, it doesn't look like it helps much.</p>
<p>The three plots at the beginning of this post are of this type, but they
<ol>
<li>don't have the annotations,
<li>contain 10 (poorly labeled) values of phi on each plot, and
<li>don't have the fixed y-scale.</ol>
<p> Now that we have enough context, we can make sense of those plots, in spite of their faults.  Since each one contains 10 sets of data, it is possible to see much more of the parameter space than it is from plots of individual curves.  From those plots (and the others like it covering the rest of the range over which we varied phi) we can see that the overall features of most of the values of phi are essentially like those exhibited by the phi = 0.56*2*pi case.  However, in the range between phi = 0.1*2*pi and phi=0.2*2*pi (plot 2 above), something different is going on.</p>
<p>Looking at the time series of S, C, and P, we can easily see what is happening:<br />
<div id="attachment_102" class="wp-caption alignnone" style="width: 310px"><a href="http://www.stumblingfool.org/blog/wp-content/uploads/2010/10/out-16.png"><img src="http://www.stumblingfool.org/blog/wp-content/uploads/2010/10/out-16-300x225.png" alt="" title="out-16" width="300" height="225" class="alignnone size-medium wp-image-70" /></a><p class="wp-caption-text">Figure 10</p></div></p>
<p>The person falls asleep when tired, shortly before the alarm goes off in the morning.  When the alarm goes off, they get up, but are tired enough that they need to go to sleep again soon, at which time they sleep much longer.  thus they alternate between short and long sleep periods.  Indeed when we look at the sleep duration plot, this is exactly what we see.<br />
<div id="attachment_103" class="wp-caption alignnone" style="width: 310px"><a href="http://www.stumblingfool.org/blog/wp-content/uploads/2010/10/times-16-zoom.png"><img src="http://www.stumblingfool.org/blog/wp-content/uploads/2010/10/times-16-zoom-300x225.png" alt="" title="times-16-zoom" width="300" height="225" class="alignnone size-medium wp-image-71" /></a><p class="wp-caption-text">Figure 11</p></div></p>
<p>On the upper and lower bounds of the range of phases for which this happens, we see that the person sometimes doesn't make it to bed before the alarm goes off, and thus misses the short sleep period.  On the other hand, if the sleep period starts enough before the alarm goes off, sometimes the person will miss the longer sleep episode, as is illustrated in the plot:<br />
<div id="attachment_104" class="wp-caption alignnone" style="width: 310px"><a href="http://www.stumblingfool.org/blog/wp-content/uploads/2010/10/times-14-19-zoom.png"><img src="http://www.stumblingfool.org/blog/wp-content/uploads/2010/10/times-14-19-zoom-300x225.png" alt="" title="times-14-19-zoom" width="300" height="225" class="alignnone size-medium wp-image-72" /></a><p class="wp-caption-text">Figure 12</p></div> </p>
<p>In this case, the lines serve to guide the eye to the fact that sometimes the alternating between short and long sleep periods isn't consistent.</p>
<h3>Wrapping up</h3>
<p>Pulling this long ramble back to the title of this post, which of these would I present in a group meeting? I'd go with Figures 6, 8, 11, 10, and 12, in that order.  Figure 12 could further be strengthened by adding plots similar to figure 10 for the appropriate values of phi.  I would also be careful to add the caveat that the parameters we've used (and in particular, the sleep and wake thresholds) were chosen somewhat arbitrarily, and that they're not based in data about actual human sleep.  Of course, since we're more interested in the behavior of the model, that isn't as important as it would be if we were trying to do sleep research, but it's worth bearing in mind. (Note in particular that for the values of sleeping and waking thresholds we've used,<br />
it looks as if the sleeper only needs about 1.5 hours of sleep per day, which seems too short).</p>
<p>I haven't discussed how to generate plots.  I mentioned above that I used an awful kludge to get (some) of these.  The rest were generated interactively in gnuplot.  A better way to do this is with a native plotting library (since this code is written in python, a good candidate is matplotlib), but I'll save that for a future post.</p>
<p>One last consideration is that the design of the plots (in terms of information content, and so on) is also something I haven't discussed here.  I'm by no means an expert on this, but I'd be a bit more careful than I have been here to maximize the useful data density of the plots if I were presenting them in a more formal setting.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.stumblingfool.org/blog/?feed=rss2&#038;p=36</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Termination conditions</title>
		<link>http://www.stumblingfool.org/blog/?p=30</link>
		<comments>http://www.stumblingfool.org/blog/?p=30#comments</comments>
		<pubDate>Fri, 15 Oct 2010 12:26:19 +0000</pubDate>
		<dc:creator>CQM</dc:creator>
				<category><![CDATA[Optimization]]></category>

		<guid isPermaLink="false">http://www.stumblingfool.org/blog/?p=30</guid>
		<description><![CDATA[One of the projects I'm supervising/working on this year is connected with optimization, and in particular, particle swarm based optimization algorithms. One of the things we've been struggling with is termination conditions, and so I thought I'd write a bit about it to clear my head. Any useful algorithm has to terminate at some point. [...]]]></description>
			<content:encoded><![CDATA[<p>One of the projects I'm supervising/working on this year is connected with optimization, and in particular, particle swarm based optimization algorithms.  One of the things we've been struggling with is termination conditions, and so I thought I'd write a bit about it to clear my head.</p>
<p>Any useful algorithm has to terminate at some point.  Usually, algorithms terminate when their work is done, but with optimization we have the difficulty that we don't exactly know when the work is done.  That is, we're done when we're "close enough" to the optimal value for our purposes, but since we don't in general know the optimum value to begin with (if we did, we wouldn't need to run an optimization algorithm, now would we?) it's hard to figure out when we're "close enough" to it.  So, we have to go by other measures.</p>
<p>The first, failsafe, condition is what I'd like to call the tl;dr condition, or the impatience condition.  That is, when we've hit some maximum number of iterations, we quit.  This is a fairly standard failsafe for many algorithms (not just optimization); if it looks like getting to an answer is going to wear out our patience, quit, and use whatever we've come up with by the time we get tired of waiting.  This is guaranteed to terminate, and it's easy to figure out the maximum time it will take to return an answer.  </p>
<p>There's some artistry (or at very least, experience) involved in choosing the maximum number of iterations; given enough intuition, it should be possible to say, "if we haven't reached a solution by now, we're going to have to wait longer than we're interested in waiting."  This time may be shorter than the actual time we're interested in waiting, and if so, we actually win by bailing out early.  I don't currently have the data yet to know if that would be the case with the algorithm we're currently using, but I'll throw it out as something to consider in the future.</p>
<p>The second possible termination condition is that the particles in our swarm have all stopped moving.  If that is the case, then clearly we aren't going to get any better values, so we may as well stop looking.  This one is easy to check, but since I haven't implemented it yet, I don't know how often it comes up. Implementation-wise, of course, we would check that the mean of the L2 norms of the velocity vectors is small.  </p>
<p>This is all fine. What is bothering me a little is that both of these could take considerable time to realize, and I feel like there ought to be a third termination condition that ought to be more easily achieved.  Usually, there will be something along the lines of, if the change in best value is below a threshold, then we say we're converged.  The problem in this case is that the best value doesn't change at every step (by design).  Since the swarm as a whole could potentially be moving away from good values and toward bad values, it's possible that it could take a number of iterations for the best<br />
value so far to improve.</p>
<p>We first tried to incorporate this into our model using this expression:<br />
<code><br />
if (abs(prevbestval - bestval) < epsilon) then<br />
  time_to_quit = .true.<br />
else<br />
  time_to_quit = .false.<br />
end if<br />
</code><br />
but found that if we make this check too frequently, the variables <code>prevbestval</code> and <code>bestval</code> are the same, and it terminates. A workaround to this problem is to only do the check at some specified interval, which is hopefully long enough to allow the best value to shift between checks.  We have found that <code>bestval</code> can stay constant for quite a while between changes (it would be good to put together some data for this), though, and so picking the right check frequency is hard.</p>
<p>We then tried<br />
<code><br />
if ( (prevbestval /= bestval )  .and. (abs(prevbestval - bestval) < epsilon)) then<br />
  time_to_quit = .true.<br />
else<br />
  time_to_quit = .false.<br />
end if<br />
</code><br />
which is equivalent to doing the other test if we update <code>prevbestval</code> only when <code>bestval</code> changes.  The problem with this approach is that it's possible to get to a "converged" best value in a bigger step than epsilon, but then after that, <code>bestval</code> won't change, so the termination condition never gets tripped.</p>
<p>What I would like is a check of this last type, but which is a bit more reliable.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.stumblingfool.org/blog/?feed=rss2&#038;p=30</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
	</channel>
</rss>

