<?xml version="1.0" encoding="utf-8"?>
<feed xmlns="http://www.w3.org/2005/Atom">

  <title><![CDATA[IFHO]]></title>
  <link href="http://gurgeh.github.com/atom.xml" rel="self"/>
  <link href="http://gurgeh.github.com/"/>
  <updated>2013-09-17T14:58:33+02:00</updated>
  <id>http://gurgeh.github.com/</id>
  <author>
    <name><![CDATA[David Fendrich]]></name>
    
  </author>
  <generator uri="http://octopress.org/">Octopress</generator>

  
  <entry>
    <title type="html"><![CDATA[Big programming, small programming]]></title>
    <link href="http://gurgeh.github.com/blog/2013/09/03/big-programming/"/>
    <updated>2013-09-03T17:39:00+02:00</updated>
    <id>http://gurgeh.github.com/blog/2013/09/03/big-programming</id>
    <content type="html"><![CDATA[<p>Programming is complicated. Different programs have different abstraction levels, domains, platforms, longevity, team sizes, etc., ad infinitum. There is something fundamentally different between the detailed instructions that go into, say, computing a checksum and the abstractions used when defining the flow of data in any medium-sized system.</p>

<p>I think that the divide between <em>coding the details</em> and <em>describing the flow</em> of a program is so large that a programming language could benefit immensely from keeping them conceptually separate. This belief has led me to design a new programming language - <em>Glow</em> - that has this separation at its core.</p>

<!-- more -->


<h2>Contrasts</h2>

<p><em>In the big picture</em>, we want good documentation.</p>

<p><em>In the details</em>, documentation is often outdated and redundant.</p>

<p>In the big picture, uncontrolled side effects and mutable global variables lead to an unmaintainable mess where hidden assumptions are everywhere.</p>

<p>In the details, say within one or a handful of functions, mutation can be efficient, idiomatic and easy to understand.</p>

<p>In the big picture, we often want to annotate functions with types, even if the language can infer them automatically or does not need them.</p>

<p>In the details, we don&#8217;t want to litter the code with explicit types, making it harder to read and write and harder to change implementation details.</p>

<p>In the big picture, exceptions are dangerous beasts. They are essentially gotos: non-local behaviour that can make a system hard to reason about and that can give surprising failure modes.</p>

<p>In the details, exceptions are a convenient way of signalling that the input to a non-total function did not live up to its contract or that the surrounding system (hardware, DB, etc.) is not in the assumed state. If we can signal these breaches of contract at compile time, using the type system, this is great. Unfortunately, not everything can be statically proven.</p>

<p>In the big picture, visual programming is very natural. It is, in fact, so natural that we often make flowcharts of how systems work. It would be better if these flowcharts actually were the system, rather than an incomplete, possibly outdated description of it.</p>

<p>In the details, visual programming is just an inconvenient gimmick. IMHO. It may change one day.</p>

<h2>The feature checklist</h2>

<ul>
<li>Pythonesque syntax</li>
<li>Statically typed</li>
<li>Automatically type inferred</li>
<li>Strict evaluation</li>
<li>Garbage collected</li>
<li>Reference implementation compiled through LLVM</li>
<li>Separate functions (small) and components (big)</li>
<li>Components are similar to coroutines and have named input and output pipes (like channels in Go, pipes in Haskell, more general than generators in e.g. Python), which are part of their type. Pipes are synchronous/blocking.</li>
<li>Even though pipes are synchronous, the standard library has queue components that can be mounted on a pipe to make it asynchronous/non-blocking with an explicit queue size.</li>
<li>An input pipe can be looped over or read &#8220;manually&#8221; one value at a time. You may not peek.</li>
<li>If a component loops over an input pipe, it may be inlined into the outputting component.</li>
<li>Pipes can be attached to components with different connections, for example making the components run in parallel, in different processes, with target component on GPU, etc.</li>
<li>A kind of linear typing for mutable data</li>
<li>Mutable data cannot be shared between components. If sent from one component to another, it becomes unreadable in the first.</li>
<li>Constant data can be shared between components.</li>
<li>Algebraic datatypes, structs and vectors</li>
<li>Pattern matching</li>
<li>Components and functions are first class and can be curried.</li>
<li>A function cannot have side effects. Effects are triggered through pipes to components.</li>
<li>The C FFI has to be wrapped as components and is the only way to get effects. None are builtin.</li>
<li>Contracts for assumptions which do not fit the type system.</li>
<li>Preprocessing/macro system with the full power of the language, like Lisp or Template Haskell.</li>
<li>Simple textual metadata can be attached to components, pipes and connections, which help when laying out the connections visually in a flowchart-type interface.</li>
</ul>


<h3>Components</h3>

<p>A component is a function that may have input arguments, but not return arguments. It also has zero or more input pipes, output pipes or bidirectional pipes. A component can send data to an output pipe and it can receive data on an input pipe, either by &#8220;forall&#8221; to loop until the pipe closes or &#8220;get&#8221; to block until it gets the next value. Pipes can also be bidirectional, which means that you can both send and receive. This is convenient, for example when you want to make database queries to a component.</p>

<p>A pipe can be closed both upstream (sender has run out of data) and downstream (receiver has found what it is looking for). Since a pipe can be closed downstream, you can compose strict pipelines that behave lazily by terminating early when they are done, yet are nicely decoupled.</p>
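<p>Since Glow has no implementation, the downstream close can only be approximated in another language. The following is a minimal Python sketch, assuming generators stand in for components and <code>close()</code> stands in for closing a pipe downstream. The names <code>naturals</code>, <code>squares</code> and <code>first_multiple_of</code> are hypothetical and not part of Glow.</p>

```python
from itertools import count

closed_log = []  # records when the upstream generator is shut down


def naturals():
    """Upstream stand-in component: an endless source of integers."""
    try:
        for n in count(1):
            yield n
    finally:
        # Runs when the consumer closes the generator (the downstream close).
        closed_log.append("upstream closed")


def squares(values):
    """Middle stand-in component: transforms each value it receives."""
    for v in values:
        yield v * v


def first_multiple_of(values, k):
    """Downstream stand-in component: stops as soon as it has what it wants."""
    for v in values:
        if v % k == 0:
            return v
    return None


source = naturals()
pipeline = squares(source)
result = first_multiple_of(pipeline, 7)

# Python does not propagate the close automatically, so this sketch
# closes each stage by hand, from the consumer back to the producer.
pipeline.close()
source.close()
```

<p>Every stage is strict, yet the endless source only produces the seven values the consumer actually reads. In Python the close has to be forwarded by hand; the text above suggests Glow would do this as part of the pipe semantics.</p>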

<p>If a component has only one input pipe, it is deterministic in the sense that the behaviour is identical to a pure function with a lazy list or generator as input argument. This means that it is easier to test. If we had a &#8220;peek&#8221; function that could check if an input pipe had anything new yet, the receiving component could loop and do different stuff depending on when a new value is ready, begging for race conditions. If a component has more than one input pipe it is not deterministic in general, since &#8220;forall&#8221; can take several pipes of the same type and emit the values in the order they are produced.</p>

<p>Components may clean up after a pipe closes, which I will describe more in a later post in conjunction with exceptions.</p>

<p>A component with one input pipe and one output sounds very similar to a function. You are supposed to use a function if there is a one-to-one mapping between input and output and you do not keep state. isPrime (:: Int -> Bool) would most naturally be a function. Run length encoding, for example, would more naturally be a component.</p>

<pre><code>#no arguments, one input pipe named values and one output pipe named encoding.
runLengthEncode :: Eq a =&gt; | values a || encoding (a, Int)
runLengthEncode():
  oldval = get values
  count = 1
  forall value in values:
    if value == oldval:
      count += 1
    else:
      put encoding (oldval, count)
      oldval = value
      count = 1
  if count:
    put encoding (oldval, count)
</code></pre>

<p>Ignore syntax, the use of Haskell-style type classes and the detail of what happens if there is no first value to get. When we have just one input and one output, this is equivalent to something like a generator using &#8220;yield&#8221; in Python.</p>
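<p>For comparison, here is roughly what the equivalent Python generator could look like. It is a sketch rather than a translation of any real Glow semantics: the function name is my own, and returning silently on empty input is an assumption, since the Glow version leaves that case open.</p>

```python
def run_length_encode(values):
    """Yield (value, count) pairs for consecutive runs in `values`."""
    iterator = iter(values)
    sentinel = object()
    oldval = next(iterator, sentinel)
    if oldval is sentinel:
        return  # assumption: empty input simply produces no output
    count = 1
    for value in iterator:
        if value == oldval:
            count += 1
        else:
            yield (oldval, count)
            oldval = value
            count = 1
    # Emit the final run once the input is exhausted.
    yield (oldval, count)


encoded = list(run_length_encode("aaabccdd"))
```

<p>Here <code>yield</code> plays the role of <code>put encoding</code>, and the <code>for</code> loop plays the role of looping over the <code>values</code> pipe.</p>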

<p>I don&#8217;t think pipes need to come with any performance overhead compared to a normal function call, so they are supposed to be used a lot.</p>

<h3>Concurrency</h3>

<p>Components can be run concurrently and it should &#8220;just work&#8221;, since components cannot share mutable data. If you need shared mutable data, a component that keeps state and has more than one input pipe will no longer have any guarantees of determinism and could be used as (or wrap), for example, a database.</p>

<p>The same component could also be run in parallel against the same input pipe, as long as the input and output order does not matter. This ordering problem is inherent in parallelism. In MapReduce, the problem of the unordered output pipe is solved by attaching a key to each input value, which can also be attached to the output.</p>
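<p>The keying trick can be illustrated in Python with the standard <code>concurrent.futures</code> module. This is only a sketch of the idea: <code>slow_square</code> and <code>parallel_map_keyed</code> are hypothetical names, and a thread pool stands in for several copies of one component reading the same pipe.</p>

```python
from concurrent.futures import ThreadPoolExecutor, as_completed


def slow_square(key, value):
    """Worker: returns the key alongside the result so order can be restored."""
    return key, value * value


def parallel_map_keyed(values, workers=4):
    """Run slow_square concurrently; results arrive in completion order."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        futures = [pool.submit(slow_square, key, value)
                   for key, value in enumerate(values)]
        # as_completed yields futures as they finish, so this list is unordered.
        unordered = [future.result() for future in as_completed(futures)]
    # The attached key lets a downstream consumer re-establish input order.
    return [result for _, result in sorted(unordered)]


ordered = parallel_map_keyed([3, 1, 4, 1, 5])
```

<p>A consumer that does not care about order could read <code>unordered</code> directly and skip the sort.</p>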

<p>When connecting two components, the type of connection indicates how the output component is run. As mentioned in the checklist, above, you can also imagine GPU connections. Since all effects except local mutation are handled as pipe communication, a Glow component should not need many constraints to target a GPU.</p>

<h3>Effects and FFI</h3>

<p>No side effects are built into the language. Even stuff like current time and print is handled through the C FFI. An FFI call must be wrapped as a component, so all effects are clearly visible as pipe communication. Since you might want the same output component to handle, for example, logging or file handling in many components, unattached output pipes with the same name and type are merged to one output pipe when two components with dangling outputs are connected.</p>

<h3>Testing and dependency injection</h3>

<p>Since side effects are often slightly troublesome when testing, it is my intention that a component connection can later be overridden (or monkey-patched, if you will) by a mock-component with the correct type. This means that the file system, database or clock for an entire application can be overridden for testing purposes or when you want to simply switch out a component across an application. I have not yet fully worked out how namespaces, etc, should work in this context.</p>
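<p>As a rough illustration of the idea, here is a Python sketch in which a clock is modelled as a generator and swapped for a mock with the same shape. All names (<code>system_clock</code>, <code>fixed_clock</code>, <code>stamp_messages</code>) are hypothetical, and passing the clock as an argument is an assumption of the sketch, not a description of how Glow would override a connection.</p>

```python
import time


def system_clock():
    """Effectful stand-in component: an endless pipe of wall-clock timestamps."""
    while True:
        yield time.time()


def fixed_clock(timestamps):
    """Mock component with the same shape: yields preset timestamps."""
    yield from timestamps


def stamp_messages(messages, clock):
    """Pairs each message with the next value read from the clock pipe."""
    for message in messages:
        yield (next(clock), message)


# Production wiring would pass system_clock(); the test wiring overrides it.
stamped = list(stamp_messages(["start", "stop"], fixed_clock([100.0, 105.5])))
```

<p>Because <code>stamp_messages</code> only sees a pipe of timestamps, it cannot tell the mock from the real clock, which is what makes the test deterministic.</p>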

<h2>Visual programming</h2>

<p>We use two types of files to program Glow. The detail files, which can contain anything, and the overview files, which may only contain constants and how components are connected to each other. An overview file may not use macros.</p>

<p>Both are editable with any text editor, but the overview file can also be edited visually as a vectorized flowchart (vectorized to zoom in on details). This should have a number of benefits.</p>

<ul>
<li>The overview chart is good documentation that is never out of date. Not for a library API, but for a system or application.</li>
<li>You can design programs on your tablet or phone, by touch. Fill in the details when you get to a keyboard.</li>
<li>You get a visual REPL for debugging and testing. Change constants (read more below) and watch changes, live.</li>
<li>The flowchart may also serve as a simple control panel for the application in production.</li>
</ul>

<p>It is important that you should never need a special application to view the overview file. It must be perfectly legible as text, even when generated by a flowchart-editor.</p>

<p>The overview file for a simple web scraper. Not final syntax:</p>

<pre><code>LinkQueue :: queue
Scraper :: urlGet timeout=5
TheExtractor :: extractor
LinkQueue.get -&gt; Scraper.urls
Scraper.data -&gt; TheExtractor.data
Scraper.error -&gt; Print.in
TheExtractor.urls -&gt; LinkQueue.put
</code></pre>

<p>The lower case variables, like &#8220;queue&#8221;, are components, but since a component can be in several places in a system, connected to different stuff and called with different arguments, each component needs an instantiation (one could imagine syntactic sugar for anonymous component instantiations, &#8220;lambdas&#8221;). The word after the dot is the channel name. &#8220;->&#8221; is a standard connection. The flowchart for this file would be three objects connected in a circle, with &#8220;Scraper&#8221; also having a print connection.</p>
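<p>To show that such a file is easy for a tool to read, here is a small Python sketch that parses the example above into instances and connections. The parser is my own illustration built on the non-final syntax shown. It ignores arguments such as <code>timeout=5</code>, and it treats <code>Print</code> as an implicit instance because the example never declares it.</p>

```python
OVERVIEW = """\
LinkQueue :: queue
Scraper :: urlGet timeout=5
TheExtractor :: extractor
LinkQueue.get -> Scraper.urls
Scraper.data -> TheExtractor.data
Scraper.error -> Print.in
TheExtractor.urls -> LinkQueue.put
"""


def parse_overview(text):
    """Split an overview file into instance declarations and connections."""
    instances = {}
    connections = []
    for line in text.splitlines():
        if "::" in line:
            name, definition = (part.strip() for part in line.split("::"))
            instances[name] = definition.split()[0]  # component name only
        elif "->" in line:
            source, target = (part.strip() for part in line.split("->"))
            # "Instance.pipe" becomes an (instance, pipe) pair.
            connections.append((tuple(source.split(".")),
                                tuple(target.split("."))))
    return instances, connections


instances, connections = parse_overview(OVERVIEW)
# Instance-level edges, which is what a flowchart view would draw.
edges = sorted({(src[0], dst[0]) for src, dst in connections})
```

<p>The resulting <code>edges</code> contain the circle LinkQueue, Scraper, TheExtractor and back to LinkQueue, plus the extra edge from Scraper to Print.</p>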

<p>Note that what you would work with in this interface is not types and class hierarchies as you might in a modelling tool, but the actual component instances. The static data. Component instances and other data that is generated dynamically, either at compile time or at runtime, would not show up. Only the concrete, static data, components and connections can be in the overview file. This is also why no macros are allowed in the overview file - how would an interface edit component connections that are generated at compile time and not written explicitly in the original source?</p>

<p>During runtime, the same flowchart-like interface could be used to add visualisers of data flowing through different pipes. Graphs, counters, latest value, etc, could be monitored. Perhaps color to indicate relative traffic. Just as objects in Python have a <code>__str__</code> or <code>__repr__</code> method for when they need to be printed, standard Glow data types could have html-representations rendered in the interface.</p>

<p>I believe that the component+pipes concept lends itself very well to Functional Reactive Programming. A side-effect of this is that statically defined data could have controllers that dynamically change the program behaviour during runtime. I&#8217;ll try to flesh out my ideas for that later.</p>

<h2>In closing</h2>

<p>This was just a first overview of Glow. I have not yet written about the type system, the module system, exceptions and contracts (yes, they are tightly coupled), metaprogramming and &#8220;sublanguages&#8221;, how functional reactive programming fits with components or, you know, the actual syntax. Most of that, except syntax (meh), module system (probably something close to MixML) and parts of the type system (higher-order stuff related to MixML or type classes), is already designed.</p>

<p>Also, if it was not obvious, there is no compiler or anything else yet. Not a shred of code. I know that a compiler (and visual editor!) is a serious undertaking. I think that this particular programming language does not rely on any magic that makes it harder to implement than, say, Rust. I also think it would be different enough to stand out. Perhaps others are intrigued and willing to help out?</p>

<p>This was inspired by Bret Victor, the Haskell pipes library, Go, my experience from work and many other things. In fact, if you have not seen Bret Victor&#8217;s brilliant <a href="http://vimeo.com/71278954">&#8220;Future of programming&#8221;</a> and <a href="http://vimeo.com/36579366">&#8220;Inventing on principle&#8221;</a>, go watch them already.</p>
]]></content>
  </entry>
  
  <entry>
    <title type="html"><![CDATA[A better Soylent - designing a simple, "optimal" nutrition shake]]></title>
    <link href="http://gurgeh.github.com/blog/2013/04/09/a-better-soylent-good/"/>
    <updated>2013-04-09T14:54:00+02:00</updated>
    <id>http://gurgeh.github.com/blog/2013/04/09/a-better-soylent-good</id>
    <content type="html"><![CDATA[<p>This post was inspired by Rob&#8217;s <a href="http://robrhinehart.com/">Soylent</a> experiment. He recently got a lot of press for constructing an almost completely artificial liquid diet containing everything a human needs and then consuming this goo exclusively for a month.</p>

<p>Yes, of course it has been done before, but I thought the idea was interesting and wanted to try it myself. I am very interested in health and longevity, so I thought I&#8217;d incorporate what I consider the most important and well-founded dietary findings in my recipe. The final recipe (let&#8217;s call it &#8220;Goop&#8221;, like the food in The Matrix) will be quite cheap compared to regular food and include only natural vegetarian ingredients and a multi-vitamin (mostly for the vitamin D).</p>

<p>Nutrition is complicated. The most difficult part of any new dietary idea is that humans are extremely long lived, so it takes forever to verify the long-term effects. Also, most of us are not keen to be put in cages and fed on a rigorous schedule, so you never know quite what people really eat. You can probably tell a lot from mouse studies and you can monitor known health markers in humans, but there are no guarantees. I have not even tried this diet myself yet (though I will in just a few days). If you decide to give it a try, you do it completely at your own risk.</p>

<p>Even if you are not interested in my final recipe for a complete nutrition shake (or shake-like substance), you might find the general discussion on what to put in your body interesting.</p>

<!--more-->


<h2>The goal</h2>

<p>My goal is to design as &#8220;healthy&#8221; a diet as possible, which is at the same time as simple as possible. Palatability comes a distant third. My plan is not to consume Goop exclusively for the rest of my life, but perhaps to consume the majority of my &#8220;meals&#8221; this way. I don&#8217;t want to give up on sublime culinary experiences or that ice cream that just hits the spot once in a while. The majority of my meals or snacks currently do not fall in those categories, though. Partly it is also a fun intellectual exercise.</p>

<p>I currently do <a href="http://en.wikipedia.org/wiki/Intermittent_fasting">intermittent fasting</a> once or twice a week. Even though it is probably just as healthy to allow yourself to eat a few hundred calories during those 24 hours, to me, it just complicates matters. It is psychologically easier just to say &#8220;stop&#8221; for 24 hours than to bargain with yourself over how much food you are allowed to eat on your fasting days. I suspect the same might be true with food intake. If your diet is extremely simple and restricted, it might be easier to follow than one with a hundred vague guidelines which will give you tastier food and more variation, but many more daily decisions and temptations.</p>

<p>It may also be a starting point for a later diversification or for optimizing in the &#8220;palatable&#8221; direction. We&#8217;ll see.</p>

<h2>Calories</h2>

<p>How many calories you need is obviously a function of many variables, including gender, age, height, physical activity and genes. Fortunately, too many or too few calories are quite easy to detect. Just weigh yourself regularly. Preferably on a scale that measures body fat percentage. I know those are not very accurate, but here we do not care much about absolute numbers, just change.</p>

<p>The simplest, most researched and most effective diet tuned for health and longevity must be <a href="http://en.wikipedia.org/wiki/Calorie_restriction">calorie restriction</a>. Eating 10-30% fewer calories than you &#8220;need&#8221;, but all the nutrition. With Goop, CR with full nutrient intake becomes easy. I will practice a mild CR, because full-blown CR is inconvenient.</p>

<p>I am almost 2 meters tall and I burn 500-600 calories a day biking to and from work. I also run regularly and do some strength training. I need more calories than most, but let&#8217;s start out with the usual assumption of a target around 2000 calories.</p>

<p>We get calories from protein, carbohydrates and fat. How much you should get from each group is a source of fierce debate. The safest bet seems to be that carbs should be somewhere between 30% and 50% of daily calorie intake, fat around 30% and protein between 15% and 30%. That is generally less carbs than in a regular western diet. On the other hand we are not aiming for so few carbs as to induce <a href="http://en.wikipedia.org/wiki/Ketosis">ketosis</a> which <em>may</em> help people lose weight somewhat faster, but does not seem generally conducive to health.</p>
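<p>To make those percentages concrete, here is a small Python sketch that converts calorie shares into grams per day. It uses the standard approximation of 4 kcal per gram for carbohydrates and protein and 9 kcal per gram for fat. The 40/30/30 split is just one illustrative point within the ranges above, not a recommendation, and <code>macro_grams</code> is a name made up for the example.</p>

```python
import math

# Standard approximate energy densities (kcal per gram).
KCAL_PER_GRAM = {"carbs": 4, "fat": 9, "protein": 4}


def macro_grams(total_kcal, shares):
    """Convert calorie shares (fractions summing to 1) into grams per day."""
    if not math.isclose(sum(shares.values()), 1.0):
        raise ValueError("shares must sum to 1")
    return {name: round(total_kcal * share / KCAL_PER_GRAM[name])
            for name, share in shares.items()}


# Assumed example split: 40% carbs, 30% fat, 30% protein of 2000 kcal.
grams = macro_grams(2000, {"carbs": 0.40, "fat": 0.30, "protein": 0.30})
```

<p>With this split, 2000 calories come out at about 200g carbohydrates, 67g fat and 150g protein per day.</p>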

<p>Eating too much protein can also be unhealthy. The liver and kidneys are <a href="http://www.betterhealth.vic.gov.au/bhcv2/bhcarticles.nsf/pages/protein?open">put under strain</a> and you may lose calcium and thus bone mass. You have to make sure you drink a lot of water and get adequate calcium if you consume 1/3 of your calories from protein. In the very well researched <a href="http://www.fantastic-voyage.net/ShortGuidehtml.htm">Fantastic Voyage</a>, Ray Kurzweil and Terry Grossman advocate 35% of calories from protein in general and 55% for the rather extreme diet for those with type 2 diabetes. Eating more protein than 1g/kg body mass will not make you gain muscle faster.</p>

<p>When it comes to fat, it is probably good for you to have a ratio of Omega 6 / Omega 3 consumption somewhere between 2:1 and 1:1. The modern diet has way too much Omega 6, which is inflammatory and bad for your heart.</p>

<p>Regarding carbs, we want slow carbs that the body digests slowly. This helps your body keep your blood sugar and insulin levels steady, which protects you from diabetes and wards off hunger. In a normal diet we also want a lot of fiber. Fiber makes you digest more slowly and feeds your intestinal bacteria. If you are not used to food with high fiber content, you need to increase fiber intake slowly, since you need to build up your gut flora. Goop contains 50-60 grams of fiber each day, which probably takes some getting used to. A common guideline is that your diet should contain at least 30g of fiber per day.</p>

<h2>Methionine</h2>

<p>When it comes to protein, things get a little tricky. Normally you would just want protein with an amino acid profile close to your own, which makes meat, eggs or <a href="http://en.wikipedia.org/wiki/Whey_protein">whey protein</a> powder a good choice, unless you are lactose intolerant.</p>

<p>Calorie restriction, see above, seems to induce slower aging. Interestingly, so does protein restriction, with some caveats. Even more interestingly, restricting just the amino acid methionine (a part of normal protein) also seems to induce slower aging (see <a href="http://en.wikipedia.org/wiki/Methionine#Methionine_restriction">methionine restriction</a>). Methionine is an essential amino acid, so we can&#8217;t just cut it out of our diet, but restricting it is possible, by eating certain vegetarian proteins and not too much of them. Soy is one protein source that has comparatively little methionine. It has gotten some bad press, but it is probably safe. Nevertheless, pea protein is perhaps the most attractive alternative.</p>

<p>Long term methionine restriction is AFAIK completely untested in humans, so I wouldn&#8217;t advocate it or try it myself. I do think it is prudent to make sure you don&#8217;t get <a href="http://arc.crsociety.org/read.php?2,205045,205045#msg-205045">excess methionine</a>, though.</p>

<h2>Intermittent fasting and carb concentration</h2>

<p>I practice intermittent fasting (IF) once or twice a week, fasting 24 hours on water and green tea. The evidence for the benefit of different kinds of IF, both in animal studies and humans (think religious fasting) has reached a thoroughly convincing level.</p>

<p>The first 1 - 3 times I was really hungry, but the body quickly adapts and it becomes easier. There is however an alternative possibility, which may give the same benefits. <a href="http://catalyticlongevity.org">Carbohydrate-concentration</a>. Instead of fasting, you only carb-fast. That is, you consume all of the carbohydrates for each day in one meal. It does not matter which one.</p>

<p>Carb concentration is less well researched than intermittent fasting, but the effects on blood sugar are fairly well understood and it looks promising. I might try carb concentration and see how convenient it feels, since it is quite easy to divide the components of a shake any way one sees fit. It is not something I will try immediately, though.</p>

<h2>Other nutrients</h2>

<h3>Vitamins</h3>

<p>We need vitamins to live. It is fairly well-known how much we need to survive, but the optimum levels are a source of debate. Several vitamins (notably C and E) are anti-oxidants, protecting your body from free radicals. According to the free-radical theory of aging, this should be very important. Yet studies don&#8217;t really find life-extending effects from consuming large amounts of anti-oxidants. At the same time, meta-studies that have found detrimental effects seem quite silly, e.g. including patients that take large quantities of vitamins for some serious ailment.</p>

<p><a href="http://en.wikipedia.org/wiki/Vitamins">Here</a> is a list of the recommended daily allowances.</p>

<p>If we cover the vitamin need with a multi-vitamin, it is <a href="http://www.livestrong.com/article/302783-daily-dose-the-vitamins-you-should-or-shouldnt-be-taking/">probably</a> best to take it in divided doses with &#8220;meals&#8221; (at least fat), for proper uptake. There are multi-vitamins made to be taken in divided doses, for example these <a href="http://www.lef.org/Vitamins-Supplements/Item01714/Two-Per-Day-Capsules.html">Two-per-day capsules</a>. Otherwise you can just split one, if it is dry.</p>

<p>Almost all multi-vitamins come with the alpha form of vitamin E, which some say is a problem. This might displace the other forms of vitamin E, and actually make you deficient. In Goop, I will get plenty of gamma-tocopherol from the flaxseed oil, so it is probably OK. Also, <a href="http://lpi.oregonstate.edu/ss03/vitamine.html">this nutritional researcher</a> notes that it might be a confusion of cause and effect and that low gamma-tocopherol may just be an indicator that something else is wrong.</p>

<p>Vitamin K is not a part of most multi-vitamins, so we need that through our diet.</p>

<h3>Dietary minerals</h3>

<p><a href="http://en.wikipedia.org/wiki/Trace_minerals">Here</a> are the RDAs for minerals. You could get most of them from a decent multi-vitamin, but those below usually need other sources.</p>

<p>Phosphorus deficiency is very rare, so there are almost no phosphorus supplements for humans (but plenty for horses).</p>

<p>Calcium is usually not included in a multi-vitamin in sufficient quantity, just a little to go with the vitamin D.</p>

<p>Potassium is a lethal poison if you consume too much of it, so you have to be very careful when supplementing. This is why potassium pills are rare and usually far below the RDA.</p>

<p>Iron is an oxidant, so you don&#8217;t want to get too much. You also don&#8217;t want to be deficient, making you anemic and lethargic. Menstruating women need more than men.</p>

<p>Copper, like iron, should not be consumed to excess. I have never seen a multi-vitamin with more than 50% RDA (RDA is 2mg) copper.</p>

<p>Sodium and chlorine are easily obtained from table salt. Most things get tastier with a little salt anyway.</p>

<h3>Phytochemicals</h3>

<p> <a href="http://en.wikipedia.org/wiki/Phytochemicals">Phytochemicals</a> (or phytonutrients) are difficult. They are the tens of thousands of possibly protective substances in plants. They are not absolutely essential to life, but they <em>may</em> protect against cancer, stroke, macular degeneration, inflammation and all sorts of things.</p>

<p><a href="http://www.webmd.com/diet/phytonutrients-faq">WebMD</a> suggests that the most important are <a href="http://en.wikipedia.org/wiki/Carotenoids">carotenoids</a>, <a href="http://en.wikipedia.org/wiki/Ellagic_acid">ellagic acid</a>, <a href="http://en.wikipedia.org/wiki/Flavonoids">flavonoids</a>, <a href="http://en.wikipedia.org/wiki/Resveratrol">resveratrol</a>, <a href="http://en.wikipedia.org/wiki/Glucosinolate">glucosinolates</a> and <a href="http://en.wikipedia.org/wiki/Phytoestrogens">phytoestrogens</a>.</p>

<p>It is very difficult to come to some sort of conclusion. The effects are certainly not obviously beneficial. Carotenoids, resveratrol and glucosinolates seem the most interesting to me, though resveratrol especially is controversial.</p>

<h3>Hormesis</h3>

<p>The matter is further complicated by the issue of <a href="http://en.wikipedia.org/wiki/Hormesis">hormesis</a>. Protecting from oxidant damage is good, but if you protect too well it seems to make it worse. For example, the positive effects of exercise (which inflicts oxidation damage to your body) can be negated by anti-oxidants. It seems that only if you stress your cells <em>just enough</em>, good things happen.</p>

<p>In my personal opinion hormesis is the single most confounding factor of longevity science. It is just like with training a muscle. If you train it too much, you may overtrain and it becomes weaker instead of stronger. If you don&#8217;t get any oxidative stress, you don&#8217;t &#8220;train&#8221; your body and trigger the beneficial heat shock responses and gene expressions, but if you get too much you &#8220;overtrain&#8221; and just break down your cells. Since so many different things add oxidative stress, so many other things protect from it, and it seems almost impossible to know whether you get too much or too little (I suppose blood tests could one day be developed), it is very hard to know what anti-oxidants will do to your body.</p>

<h2>The actual recipe for one day</h2>

<p>Where possible and not too expensive, I will use organically grown ingredients, to avoid pesticides. I <a href="http://www.bbc.co.uk/news/health-19465692">don&#8217;t think the benefit is large</a>, so I won&#8217;t go to special stores or pay a premium of more than 10% or so. Oat, broccoli, berries and tomatoes are the ingredients below that I think might benefit from not having pesticides.</p>

<h3>500g plain 3% fat, unsweetened yoghurt or sour milk</h3>

<p>This adds protein, calcium, potassium and some possibly beneficial gut bacteria.</p>

<h3>300g oat bran</h3>

<p>This is the main source of low-GI carbs, copper and iron. It is also a good source of phosphorus and potassium. The fiber will make you feel satiated longer, and the <a href="http://en.wikipedia.org/wiki/Beta-glucan#Research">beta-glucans</a> in oats in particular lower cholesterol, especially LDL. According to the Wikipedia link above, beta-glucans may also boost your immune system and have some anti-cancer properties. Also, it is a whole grain that contains no gluten, which is trendy to avoid even amongst non-celiacs because of its allegedly inflammatory nature. I don&#8217;t think the final verdict on gluten is in yet. Bear in mind, though, that some people with celiac disease are also allergic to oats.</p>

<h3>40g flaxseed oil</h3>

<p>This is the main source of fat and what restores the Omega 3 - Omega 6 balance. Flaxseed oil is a rich source of alpha-linolenic acid (ALA, an Omega 3 fatty acid) and some people take it as a nutritional supplement. Strongly anti-inflammatory.</p>

<h3>150g Broccoli</h3>

<p>150g is supposed to be the average weight of one stalk of broccoli. Exact dosage is not terribly important, but it is the main source of vitamin K, so not below 75g. Broccoli is a &#8220;leafy-green&#8221;, which counts as the most healthy of the vegetable families. It is rich in phytosterols, which fight cholesterol. It is also mildly anti-inflammatory.</p>

<h3>50g almonds</h3>

<p>Almonds <a href="http://www.sfgate.com/business/article/Studies-Show-that-Almonds-Help-Regulate-Blood-2307482.php">control blood sugar</a> and make you feel full longer. Mildly anti-inflammatory.</p>

<h3>Flavoring</h3>

<p>Besides the five main ingredients, I have also added four flavorings.</p>

<p><em>10g <a href="http://www.healthdiaries.com/eatthis/10-health-benefits-of-cinnamon.html">cinnamon</a></em>, which is most famous for regulating blood sugar and lowering LDL.</p>

<p><em>20g cocoa powder</em>, which is rich in flavonoids and is linked to health benefits like a lower risk of coronary heart disease and stroke. Also it tastes nice.</p>

<p><em>10g sun-dried tomatoes</em>. 10g is about five dried tomatoes. I suppose I could use fresh tomatoes as well, but the drying process preserves the nutrients and makes them easier to transport and handle. Tomatoes contain <a href="http://en.wikipedia.org/wiki/Lycopene#Preliminary_research_and_potential_health_benefits">lycopene</a>. Tomato consumption has been associated with decreased risk of certain cancers.</p>

<p><em>100g frozen unsweetened berries</em>. Mostly bilberries, but any berry will do. Full of phytochemicals.</p>

<p>I don&#8217;t know yet how I will mix these flavorings. Perhaps I will just throw them together. Perhaps I will keep them in separate meals.</p>

<p>If it needs to be sweeter, I will add <a href="http://en.wikipedia.org/wiki/Stevia#Folk_medicine_and_research">stevia</a> which must be far and away the best sweetener. Wikipedia: &#8220;Current research has evaluated its effects on obesity and hypertension. Stevia has a negligible effect on blood glucose, and may even enhance glucose tolerance&#8221;.</p>

<h3>Sodium</h3>

<p>Goop contains very little salt, so I&#8217;ll need to add 5-10g. How much salt you lose when working out seems to vary between individuals, but you lose roughly 1g per liter of sweat and sweat roughly 1 liter during a 1 hour run. 30-50% of the population are sensitive to too much salt and may get high blood pressure. For the rest, it does not seem to make a difference.</p>

<h3>Vitamins</h3>

<p>The RDAs for minerals are easy enough to satisfy, but vitamin D is damn near impossible to get enough of from natural sources. The only possible exception is fatty fish, but I don&#8217;t want that in a shake&#8230; If you live in a dark country and/or spend your working hours inside, like I do, getting enough vitamin D is even more important. Goop, as it stands, also lacks vitamin A, but that could be easily remedied with a carrot a day. I will take a daily multi-vitamin, but with a carrot, that could probably be downgraded to just vitamin D.</p>

<p>Vitamin C is unstable over time in the presence of water and I think others may react with iron, so it is probably best not to add vitamins to the mix, but swallow them separately.</p>

<h3>Fiber</h3>

<p>Goop contains loads of soluble fiber, which amongst other things is good for lowering cholesterol. The high fiber content (56g) means that my stomach will have to get used to it gradually, by building up suitable gut bacteria. It also means that I have to drink a lot of water.</p>

<h3>Adding it up</h3>

<p>This comes to 2131 calories per day, with 198g carbs, 105g fat and 76g protein. The food is very anti-inflammatory and has a low glycemic load. The <a href="http://www.axa.se/Vara-produkter/Gryn/Havregryn/">oat bran I use</a> seems to have a slightly different nutritional value than the one in the <a href="http://nutritiondata.self.com/facts/cereal-grains-and-pasta/5703/2">databases</a>, so YMMV.</p>
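<p>As a rough cross-check of those totals, here is a minimal sketch that assumes the general Atwater factors (about 4 kcal per gram of carbohydrate and protein, 9 kcal per gram of fat). These factors are an assumption for illustration, not something taken from the databases linked above.</p>

```python
# Approximate energy per gram (general Atwater factors; an assumption here).
KCAL_PER_GRAM = {"carbs": 4, "fat": 9, "protein": 4}

# Daily macronutrient totals quoted in the text, in grams.
macros_g = {"carbs": 198, "fat": 105, "protein": 76}

estimate = sum(macros_g[name] * KCAL_PER_GRAM[name] for name in macros_g)
print(estimate)  # 2041
```

<p>The estimate lands within about 5% of the 2131 figure. The gap could come from rounding or from how the databases count fiber.</p>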

<p>For a physically active male, even if he would want mild calorie restriction, this is probably too few calories. There is room for adding more carbs and protein, but preferably not more fat. Fruits, carrots, beans, protein powder and lean meat are possible additions. I suppose recovery drinks or bars after exercise are one simple, if somewhat expensive and sugary, way to compensate with more carb and protein calories. I will experiment a bit with this.</p>

<p>If I wanted fewer calories, let&#8217;s say for dieting, I would cut up to half the almonds, half the flaxseed oil and 100g of the oat bran. It is not supposed to be healthy to lose more than 0.5 kg (1 pound) per week, which is about 600 calories per day below your normal intake, so don&#8217;t go overboard.</p>
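<p>The weekly figure can be turned into a daily deficit with a quick calculation. This sketch assumes the common rule of thumb of roughly 7700 kcal per kg of body fat, which is my assumption and not a number from the sources above.</p>

```python
KCAL_PER_KG_FAT = 7700  # rule-of-thumb energy content of body fat (assumption)
weekly_loss_kg = 0.5    # the upper limit mentioned in the text

daily_deficit = weekly_loss_kg * KCAL_PER_KG_FAT / 7
print(round(daily_deficit))  # 550
```

<p>That is in the same range as the roughly 600 calories per day quoted above.</p>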

<p>I will just mix the ingredients and some water together to a shake or drinkable substance. I will try to mix just a little, since a smooth goo would raise the glycemic index. I still want to see oats. For variation, oat bran (say 200g), water (3 dl, pour over when boiling), oil (at most ½ dl), stevia and cinnamon or cocoa as a flavouring can be blended (once again, not too smooth), spread thinly and baked in the oven at 150 degrees C for 50-60 minutes, for a dry, not quite delicious but nutritious cookie. Since it is dry, it will be edible for many days. The low temperature will not oxidize the oil or destroy other nutrients.</p>

<p>I will also eat sugar-free gum after meals, to make sure that my gums get some workout. Also a <a href="http://www.deltadentalins.com/oral_health/heart.html">clean mouth</a> is imperative for a healthy heart.</p>

<h3>Experiment</h3>

<p>As an experiment, I will consume Goop exclusively for a few weeks (at least three) and measure blood pressure, weight, body composition (if I can find my body composition scale), resting heart rate, mental ability, athletic ability and general well being continuously. A few people I know have expressed interest in doing the same. It would be interesting to measure various markers from my blood as well. Unfortunately, where I live this is not really offered as a separate service but rather as part of a complete health exam, which is quite expensive to do twice in succession. I will probably do it afterwards, though.</p>

<p>I will not go cold turkey from normal food immediately, but increase gradually, mainly to make sure I can handle the fiber content.</p>

<p>Adding nootropics and stimulants, like the original Soylent guy did, is also interesting. However, to avoid complicating things, I will add this later. I don&#8217;t really know which substances, but at least ginkgo biloba.</p>
]]></content>
  </entry>
  
  <entry>
    <title type="html"><![CDATA[A better algorithm for backups and rolling logs]]></title>
    <link href="http://gurgeh.github.com/blog/2013/02/22/a-better-algorithm-for-backups-and-rolling-logs/"/>
    <updated>2013-02-22T17:46:00+01:00</updated>
    <id>http://gurgeh.github.com/blog/2013/02/22/a-better-algorithm-for-backups-and-rolling-logs</id>
    <content type="html"><![CDATA[<p>In your <code>/var/log/</code> you will most probably have logs that have grown too large and rolled over. By default your system logger gzips and stores a few of the older ones and finally, when they get too numerous, it just deletes them. The same goes for log handlers in most languages, for example Python&#8217;s RotatingFileHandler. Backups are usually also handled the same way, when you don&#8217;t want to store every backup.</p>

<p>That is almost never what I want. If I store only N backups, I don&#8217;t want them to be only the very latest. If they are close in time they will be more similar and in some sense contain less information than backups that are spread out over time. Of course the newer ones are probably on average more interesting to me, so I don&#8217;t want them evenly spread over time. The following simple algorithm is my suggestion for a better way to handle rotating logs and backups.</p>

<!--more-->


<h2>The algorithm</h2>

<p>If we call the day the very first backup is made &#8220;day 0&#8221;, the current state can be defined as a list of integers. If we currently have backups from day 0, 15, 20, 25 and 28, this is represented with the list <em>[0, 15, 20, 25, 28]</em>. Let us say that we want a maximum of 5 backups, then when we add the backup for day 29, we have to decide which backup to discard. We do this by rating each of the 6 possible (since we have 6 different numbers to potentially discard) configurations with a fitness function and choosing the best one. A lower score is better.</p>

<p>The optimal fitness function will be different from case to case. In essence it should be a balance, set by a parameter, between how valuable it is to have recent data and how interesting it is to have the data well spread out. As an example of a parameterless logarithmic fitness function, consider the following function, which rates one configuration by punishing each point by how far it is from its ideal logarithmic position:</p>

<figure class='code'> <div class="highlight"><table><tr><td class="gutter"><pre class="line-numbers"><span class='line-number'>1</span>
<span class='line-number'>2</span>
<span class='line-number'>3</span>
<span class='line-number'>4</span>
<span class='line-number'>5</span>
<span class='line-number'>6</span>
<span class='line-number'>7</span>
<span class='line-number'>8</span>
<span class='line-number'>9</span>
</pre></td><td class='code'><pre><code class='python'><span class='line'><span class="kn">import</span> <span class="nn">math</span>
</span><span class='line'>
</span><span class='line'><span class="k">def</span> <span class="nf">expfit</span><span class="p">(</span><span class="n">backups</span><span class="p">,</span> <span class="n">now</span><span class="p">):</span>
</span><span class='line'>    <span class="n">exponent</span> <span class="o">=</span> <span class="n">math</span><span class="o">.</span><span class="n">log</span><span class="p">(</span><span class="n">now</span><span class="p">)</span> <span class="o">/</span> <span class="n">math</span><span class="o">.</span><span class="n">log</span><span class="p">(</span><span class="nb">len</span><span class="p">(</span><span class="n">backups</span><span class="p">))</span>
</span><span class='line'>
</span><span class='line'>    <span class="k">def</span> <span class="nf">exp_deviation</span><span class="p">(</span><span class="n">backup</span><span class="p">,</span> <span class="n">backupnr</span><span class="p">):</span>
</span><span class='line'>        <span class="k">return</span> <span class="nb">abs</span><span class="p">(</span><span class="n">now</span> <span class="o">-</span> <span class="n">backup</span> <span class="o">-</span> <span class="n">backupnr</span> <span class="o">**</span> <span class="n">exponent</span><span class="p">)</span>
</span><span class='line'>    <span class="k">return</span> <span class="nb">sum</span><span class="p">(</span><span class="n">exp_deviation</span><span class="p">(</span><span class="n">b</span><span class="p">,</span> <span class="n">bnr</span><span class="p">)</span>
</span><span class='line'>               <span class="k">for</span> <span class="n">bnr</span><span class="p">,</span> <span class="n">b</span> <span class="ow">in</span> <span class="nb">enumerate</span><span class="p">(</span><span class="nb">reversed</span><span class="p">(</span><span class="n">backups</span><span class="p">)))</span>
</span></code></pre></td></tr></table></div></figure>


<p>I have stored the code in an online interpreter called <a href="http://repl.it/ICA/3">repl.it</a>. You can try it out by clicking the Play button and then typing <code>expfit([0, 15, 20, 25, 28], 29)</code> and for example compare with replacing your most recent backup with today&#8217;s, <code>expfit([0, 15, 20, 25, 29], 29)</code>.</p>
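<p>To see the whole decision for day 29 at once, the following sketch scores all six candidate configurations. It is a Python 3 restatement of the fitness function above; the helper name <code>rank_candidates</code> is hypothetical and only used for this illustration.</p>

```python
import math


def expfit(backups, now):
    # Same scoring as above: the k-th newest backup ideally has age
    # k ** exponent, with exponent chosen so len(backups) ** exponent == now.
    exponent = math.log(now) / math.log(len(backups))
    return sum(abs(now - b - k ** exponent)
               for k, b in enumerate(reversed(backups)))


def rank_candidates(backups):
    # Score every list obtained by dropping exactly one backup.
    # The newest entry counts as "now", even when it is the one dropped.
    now = backups[-1]
    scored = []
    for i in range(len(backups)):
        kept = backups[:i] + backups[i + 1:]
        scored.append((expfit(kept, now), backups[i], kept))
    return sorted(scored)  # lowest (best) score first


ranking = rank_candidates([0, 15, 20, 25, 28, 29])
for score, dropped, kept in ranking:
    print("drop day %2d  score %6.2f  keep %s" % (dropped, score, kept))
```

<p>The first printed line is the configuration the greedy algorithm would keep.</p>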

<p>To simulate how it would look after a certain time, I use the following code (also loaded in the session above):</p>

<figure class='code'> <div class="highlight"><table><tr><td class="gutter"><pre class="line-numbers"><span class='line-number'>1</span>
<span class='line-number'>2</span>
<span class='line-number'>3</span>
<span class='line-number'>4</span>
<span class='line-number'>5</span>
<span class='line-number'>6</span>
<span class='line-number'>7</span>
<span class='line-number'>8</span>
<span class='line-number'>9</span>
<span class='line-number'>10</span>
</pre></td><td class='code'><pre><code class='python'><span class='line'><span class="k">def</span> <span class="nf">tryall</span><span class="p">(</span><span class="n">backups</span><span class="p">):</span>
</span><span class='line'>    <span class="n">remove</span> <span class="o">=</span> <span class="nb">min</span><span class="p">((</span><span class="n">expfit</span><span class="p">(</span><span class="n">backups</span><span class="p">[:</span><span class="n">i</span><span class="p">]</span> <span class="o">+</span> <span class="n">backups</span><span class="p">[</span><span class="n">i</span> <span class="o">+</span> <span class="mi">1</span><span class="p">:],</span> <span class="n">backups</span><span class="p">[</span><span class="o">-</span><span class="mi">1</span><span class="p">]),</span> <span class="n">i</span><span class="p">)</span>
</span><span class='line'>                  <span class="k">for</span> <span class="n">i</span> <span class="ow">in</span> <span class="nb">xrange</span><span class="p">(</span><span class="nb">len</span><span class="p">(</span><span class="n">backups</span><span class="p">)))[</span><span class="mi">1</span><span class="p">]</span>
</span><span class='line'>    <span class="k">return</span> <span class="n">backups</span><span class="p">[:</span><span class="n">remove</span><span class="p">]</span> <span class="o">+</span> <span class="n">backups</span><span class="p">[</span><span class="n">remove</span> <span class="o">+</span> <span class="mi">1</span><span class="p">:]</span>
</span><span class='line'>
</span><span class='line'><span class="k">def</span> <span class="nf">simulate</span><span class="p">(</span><span class="n">nrbackups</span><span class="p">,</span> <span class="n">targetnow</span><span class="p">):</span>
</span><span class='line'>    <span class="n">backups</span> <span class="o">=</span> <span class="nb">range</span><span class="p">(</span><span class="n">nrbackups</span><span class="p">)</span>
</span><span class='line'>    <span class="k">for</span> <span class="n">now</span> <span class="ow">in</span> <span class="nb">xrange</span><span class="p">(</span><span class="n">nrbackups</span><span class="p">,</span> <span class="n">targetnow</span> <span class="o">+</span> <span class="mi">1</span><span class="p">):</span>
</span><span class='line'>        <span class="n">backups</span> <span class="o">=</span> <span class="n">tryall</span><span class="p">(</span><span class="n">backups</span> <span class="o">+</span> <span class="p">[</span><span class="n">now</span><span class="p">])</span>
</span><span class='line'>    <span class="k">return</span> <span class="n">backups</span>
</span></code></pre></td></tr></table></div></figure>


<p>Run it by <code>simulate(10, 100)</code> which should yield <em>[18, 30, 44, 59, 70, 84, 92, 96, 99, 100]</em>. That list looks like rather a nice set of backups after 100 days, to me.</p>
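<p>The listings above are written for Python 2 (<code>xrange</code>, and <code>range</code> returning a list). A minimal Python 3 sketch of the same greedy simulation could look like this. It is a port under the assumption that nothing else about the algorithm changes; if the port is faithful, the printed list should match the one quoted above.</p>

```python
import math


def expfit(backups, now):
    # Penalise each backup by its distance from its ideal logarithmic age.
    exponent = math.log(now) / math.log(len(backups))
    return sum(abs(now - b - k ** exponent)
               for k, b in enumerate(reversed(backups)))


def tryall(backups):
    # Drop the single backup whose removal gives the lowest score.
    # On ties, min() keeps the first candidate, i.e. the lowest index.
    now = backups[-1]
    candidates = [backups[:i] + backups[i + 1:] for i in range(len(backups))]
    return min(candidates, key=lambda kept: expfit(kept, now))


def simulate(nrbackups, targetnow):
    # Start with one backup per day, then add a day and discard one backup.
    backups = list(range(nrbackups))
    for now in range(nrbackups, targetnow + 1):
        backups = tryall(backups + [now])
    return backups


print(simulate(10, 100))
```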

<h2>A functional reactive programming simulation in Elm</h2>

<p>Visualising this seemed like a perfect excuse to try out <a href="http://elm-lang.org/">Elm</a>, a purely functional programming language that handles input and output with the cool <a href="http://elm-lang.org/learn/What-is-FRP.elm">Functional Reactive Programming</a> technology. I have put the resulting simple simulation <a href="http://gurgeh.github.com/assets/simulation.html">here</a>.</p>

<h2>Possible improvements</h2>

<ul>
<li>The fitness function can potentially be made better, for example by incorporating a parameter as I described.</li>
<li>The algorithm is currently greedy, just using the best configuration for each step. It could run Monte Carlo simulations to see if a currently worse configuration will be better in the long run.</li>
<li>For some applications, early versions of a file are smaller than late versions. The fitness function could incorporate that you want to use a certain number of bytes for backups and be allowed to decide how many backups to keep as long as it is under this limit. This would skew it towards keeping more of the earlier but &#8220;cheaper&#8221; backups.</li>
<li>I could have checked if someone has already done something similar before I played with this, but that would not have been as much fun, now would it?</li>
</ul>


<p>I have a Github project called <em>checkpoint</em> with source files for Python, Haskell and Elm <a href="http://github.com/gurgeh/checkpoint">here</a>.</p>
]]></content>
  </entry>
  
  <entry>
    <title type="html"><![CDATA[Pre-programming Mental Silence Meditation with Entrainment]]></title>
    <link href="http://gurgeh.github.com/blog/2013/01/24/pre-programming-mental-silence-meditation-with-entrainment/"/>
    <updated>2013-01-24T16:01:00+01:00</updated>
    <id>http://gurgeh.github.com/blog/2013/01/24/pre-programming-mental-silence-meditation-with-entrainment</id>
    <content type="html"><![CDATA[<p>I am not a dirty hippie. No, honestly, I&#8217;m not. I am, in fact, a grumpy, skeptical, philosophically materialist atheist. Yet, on and off for the last 15 years I have practiced meditation.</p>

<p>It feels good, helps me relax and it makes my mind feel clear, perceptive and sharp afterwards. It was only recently that I started doing it in conjunction with programming. The match was so good, that I felt the proselytic urge to share what I had found. My example is programming, but it can obviously be any intensive, possibly creative, mental task, such as writing, studying or playing Yahtzee.</p>

<!--more-->


<h2>Why</h2>

<p>In the 60s and 70s all sorts of fantastical and simply untrue claims were made about meditation. Unfortunately that can make people zone out when they read about recent research of higher quality that shows many real effects. See for example <a href="http://en.wikipedia.org/wiki/Research_on_meditation">this Wikipedia page on meditation research</a>.</p>

<p>Meditation has, among other things, been shown to increase happiness and well-being and <a href="http://www.kurzweilai.net/mindfulness-meditation-training-changes-brain-structure-in-8-weeks">physically change structures</a> in <a href="http://www.kurzweilai.net/meditation-may-change-brains-physical-structure-strengthen-connections">the brain</a> related to those things. It improves short-term memory. It may decrease blood pressure and risk of heart attack.</p>

<p>But you don&#8217;t actually have to care about any of those things. The effects after a 15 (or possibly 10 if you&#8217;re really in a hurry) minute &#8220;successful&#8221; meditation are so obvious that you should be able to evaluate if it is worth it to you, on that alone. Many do longer than 15 minutes. 30 minutes or more is not unusual. But perhaps you have a busy day - 15 minutes is enough.</p>

<h3>Programming or problem-solving</h3>

<p>Before settling down to meditate, decide on what you want to work on next. If there is some creativity or problem-solving involved, start by very quickly going through the problem in your mind, just long enough so that you have put what needs to be done within what constraints into words.</p>

<p>Now here&#8217;s the trick - don&#8217;t solve or design anything yet. Fight that urge. Let those thoughts vanish for now. It is especially important that you don&#8217;t think about the task during the actual meditation. Your subconscious will take care of that.</p>

<p>Afterwards you will be in a mentally refreshed state, ready to jump in and create. It feels different.</p>

<h3>Procrastination</h3>

<p>Meditation has always worked well for me as a cure for procrastination. If you need to do something that you have been putting off, it is easy to convince your mind that you might do it even better after some meditation. Easy, because that lets you avoid the task even longer. However, promise yourself that you will do it right after you are done.</p>

<p>Sit for as long as you want. If you get impatient after a while and long to do something, great! Just stop and do your task. Right after meditation you will be in a very special state. Facing that task, as long as it is well-defined, will probably no longer feel like anything special at all.</p>

<h2>How</h2>

<p>When I write &#8220;meditation&#8221;, I mean evoking the relaxation response through mental silence. The basic concept is really stupendously easy. 1) Just sit. 2) And don&#8217;t have an inner dialogue.</p>

<p>First part you can do, right? I suppose you can lie down as well, but we are not going for sleepy here. Thus sitting seems to convey the right sense of calm wakefulness. Also, if you are doing this in an office, sitting meditation in your chair might be slightly unusual, but getting down and lying on the floor will be too full on eccentric for most people.</p>

<p>But the second part. No inner dialogue. That usually scares people. &#8220;I can never do that&#8221;. &#8220;Even if you think about not thinking, that is thinking too&#8221;, etc. This is where various techniques come in. You will have to find out what works for you.</p>

<p>The most important thing to know is that you will have thoughts. Your inner dialogue will notice the silence and try to kickstart all sorts of new interesting internal conversations. This is not failure or you doing something wrong, it is what brains do. Just notice what happens and, no matter how tempting and interesting that thought seemed, let it pass. It will come back to you when you are done. Redirect your attention to something else. How good it feels to breathe out, for example. The longer you sit, the more space there will be between these attempts of your brain to start the dialogue again. Also, as you meditate more often, your thoughts will get sparser quicker.</p>

<p>When you start out, or when you pick up the habit after a long hiatus, you will have lots of meta thoughts of the type &#8220;Hmm.. how am I doing?&#8221;, &#8220;Is this right?&#8221;, &#8220;This does feel kinda good&#8221; and &#8220;OMG, my mind was totally silent for a bit there!&#8221;. That, too, is normal. When you catch yourself doing it, don&#8217;t judge. Just let the thought go.</p>

<p>It tends to be easier if you close your eyes, I think. But feel free to gaze at something calming or hypnotic or just into the distance. For something to focus on besides the mind&#8217;s chatter, try one or two of the suggestions below.</p>

<h3>Breathing</h3>

<p>One technique is to focus on your breathing. In through your nose, out through your mouth. Slowly. Focus on how it feels. When words come, return to your breathing. Some people imagine breathing in energy or &#8220;good stuff&#8221; or whatever. Some people count their breaths to 10 and then over from 1 again. I don&#8217;t use that myself, because counting feels, to me, slightly like dialogue. It is effective at stopping other words, though.</p>

<h3>Imagining</h3>

<p>Imagine you are a mountain. A patient, wise Japanese one. Sit like a mountain. Or, imagine that your mind is a completely still lake or pool. No ripples. Calm.</p>

<p>Imagine a spot of light in the middle of your forehead. Or maybe a laser. Shine it.</p>

<p>Compassionate meditation is an alternative that can feel good. Imagine a small sphere around you. Wish everything within it well. Give everything within compassion and love. Slowly expand your sphere. Smile and breathe.</p>

<h3>Getting started</h3>

<p>Getting started can be the tricky part. One way to help this is to borrow techniques from hypnosis. For example, with each slow breath you may imagine yourself taking one step further down a ten step stair that leads to a door. When opened, the door leads to a calm nature scene that you have chosen in advance. You sit down and start your meditation.</p>

<p>Also it helps if you are in a reasonably calm state to begin with. If you are all wound up, mind racing, maybe stressed about something, it will be much harder to let yourself settle down and relax.</p>

<h2>Brainwave entrainment</h2>

<p>The most effective way I know of reaching a meditative state is through sound. Specifically, through <a href="http://en.wikipedia.org/wiki/Brainwave_entrainment">brainwave entrainment</a>. I know. Just the word &#8220;brainwave&#8221; makes it sound like such a load of bullcrap.</p>

<p>Just like with meditation there are some weird claims about entrainment, like being able to overcome all sorts of maladies, heal wounds faster, etc. Ignore those things. If any of those benefits exist at all, they are just secondary effects of being able to relax and sleep better.</p>

<p>What certainly is real, though, is that our brain sends out electromagnetic waves of various frequencies. Depending on what state we are in, different frequencies dominate. <a href="http://en.wikipedia.org/wiki/Electroencephalography#Comparison_table">Here</a> is a table of different frequencies and when they dominate. Meditation happens in the alpha and theta regions.</p>

<p>What is also real is that listening to auditory pulses of specified frequencies makes your brain emit those same frequencies more strongly. Thus, listening to pulses of theta or alpha frequencies (4 - 13 Hz) makes you emit more of those. Of course the association between relaxation and alpha waves might not go both ways. This might just be a mixup of cause and effect, and making the brain emit more alpha waves might not make you more relaxed.</p>

<p>Well, studies show that it does. Less interesting to you, perhaps, is the anecdotal evidence from myself and everyone I know who has tried it (including my infant girl, whom I put to sleep that way a few times): it makes you go into a meditative state (or, if you like the terminology better, elicits the relaxation response) faster than anything else. It might just be placebo, but given the evidence above, it is probably not. Also, it will work on the first try, so it really is very easy for you to try it out for yourself.</p>

<p>There are many programs that help you put together these sorts of frequency curves, add background noise and stuff. <a href="http://www.transparentcorp.com/products/np/">Neuro-programmer 3 Home</a> is very good, but only for Windows (rumoured to work on Linux + Wine) and a tad expensive. I recommend <a href="http://gnaural.sourceforge.net/">Gnaural</a>, which gives you very detailed control, is free and open source, and works on Win, Mac and Linux.</p>

<p>There are many builtin examples, but I have posted a file that Gnaural can read, with a 15 minute 4-5 Hz meditation with isochronic pulses for headphones <a href="https://gist.github.com/4673317">here</a>. After the meditation phase is done (you will know without a doubt when that phase is done, because the higher frequencies will gently wake you up) I have put in 30 minutes of 40 Hz. If it helps you with your productive phase, let it run. If it gives you a headache, turn it off.</p>

<p>Another good thing about entrainment is that it tells you when you are done, so you don&#8217;t have to wonder about how long you have been sitting.</p>

<p>So, in conclusion, don&#8217;t put it off. Take 15 minutes and try it right now. Then tell me in the comments what you thought.</p>
]]></content>
  </entry>
  
  <entry>
    <title type="html"><![CDATA[Compile time loops in C++11 with trampolines and exponential recursion]]></title>
    <link href="http://gurgeh.github.com/blog/2012/11/22/compile-time-loops-in-c-plus-plus-11-with-trampolines-and-exponential-recursion/"/>
    <updated>2012-11-22T15:05:00+01:00</updated>
    <id>http://gurgeh.github.com/blog/2012/11/22/compile-time-loops-in-c-plus-plus-11-with-trampolines-and-exponential-recursion</id>
    <content type="html"><![CDATA[<p>Nowadays with <em>constexpr</em>, we can make pure functions that are
calculated at compile time. The restriction on these functions is that
they may only consist of one return statement (though additional lines with
typedefs, static_assert, etc are OK), and this statement may only call
other constexpr functions.</p>

<p>If-statements are easy - just use the ternary ?: operator or, if you
want to show off, template specialization. But the only way to loop
with these restrictions is by using recursion. Furthermore, neither
clang 3.1 nor GCC 4.7 (nor the current build of GCC 4.8) supports tail
call elimination in these constexprs, so normal linear loops will
still eat stack space if we loop for a while. Also, the standard
only recommends that compilers support a recursion depth of at least 512,
which compilers typically also use as their default limit. That
means that if we want to do something silly/fun/interesting
at compile time, we have to mess with compiler-specific switches to
get the program to compile. No fun.</p>

<p>In this post, I show one way of working around those limitations.</p>

<!--more-->


<p>But first - why would one want to do constexpr meta programming, when
we already have template meta programming? Constexprs and templates
are almost equally powerful, except that template meta programming
is lazy, which some might think fits better into a functional style. A
more important difference is that template meta programming can work
with types, which means that we can use it for cool type-level
correctness stuff. So, once again, why would we use constexprs?
1) Constexprs are evaluated several times faster in the compilers I know
of, which can be important. 2) The syntax is much nicer.</p>

<h2>Simple recursion</h2>

<p>Simple recursion is not a problem:</p>

<figure class='code'><figcaption><span>Simple constexpr recursion - check primality</span></figcaption><div class="highlight"><table><tr><td class="gutter"><pre class="line-numbers"><span class='line-number'>1</span>
<span class='line-number'>2</span>
<span class='line-number'>3</span>
<span class='line-number'>4</span>
<span class='line-number'>5</span>
<span class='line-number'>6</span>
<span class='line-number'>7</span>
<span class='line-number'>8</span>
<span class='line-number'>9</span>
<span class='line-number'>10</span>
<span class='line-number'>11</span>
<span class='line-number'>12</span>
<span class='line-number'>13</span>
<span class='line-number'>14</span>
<span class='line-number'>15</span>
<span class='line-number'>16</span>
<span class='line-number'>17</span>
<span class='line-number'>18</span>
</pre></td><td class='code'><pre><code class='c++'><span class='line'><span class="n">constexpr</span> <span class="kt">bool</span> <span class="n">rec_isprime</span><span class="p">(</span><span class="kt">unsigned</span> <span class="kt">long</span> <span class="n">n</span><span class="p">,</span> <span class="kt">unsigned</span> <span class="kt">long</span> <span class="n">div</span><span class="p">){</span>
</span><span class='line'>  <span class="k">return</span> <span class="n">n</span> <span class="o">%</span> <span class="n">div</span> <span class="o">==</span> <span class="mi">0</span> <span class="o">?</span>
</span><span class='line'>           <span class="kc">false</span>
</span><span class='line'>         <span class="o">:</span> <span class="p">(</span><span class="n">div</span> <span class="o">*</span> <span class="n">div</span> <span class="o">&gt;</span> <span class="n">n</span> <span class="o">?</span> <span class="kc">true</span> <span class="o">:</span> <span class="n">rec_isprime</span><span class="p">(</span><span class="n">n</span><span class="p">,</span> <span class="n">div</span> <span class="o">+</span> <span class="mi">2</span><span class="p">));</span>
</span><span class='line'><span class="p">}</span>
</span><span class='line'>
</span><span class='line'><span class="n">constexpr</span> <span class="kt">bool</span> <span class="n">isprime</span><span class="p">(</span><span class="kt">unsigned</span> <span class="kt">long</span> <span class="n">n</span><span class="p">){</span>
</span><span class='line'>  <span class="k">return</span> <span class="n">n</span> <span class="o">==</span> <span class="mi">2</span> <span class="n">or</span> <span class="n">n</span> <span class="o">==</span> <span class="mi">3</span> <span class="n">or</span> <span class="p">(</span><span class="n">n</span> <span class="o">%</span> <span class="mi">2</span> <span class="n">and</span> <span class="n">rec_isprime</span><span class="p">(</span><span class="n">n</span><span class="p">,</span> <span class="mi">3</span><span class="p">));</span>
</span><span class='line'><span class="p">}</span>
</span><span class='line'>
</span><span class='line'><span class="cp">#include&lt;iostream&gt;</span>
</span><span class='line'>
</span><span class='line'><span class="kt">int</span> <span class="n">main</span><span class="p">(){</span>
</span><span class='line'>  <span class="c1">//constexpr in declaration guarantees that it is evaluated compile time</span>
</span><span class='line'>  <span class="n">constexpr</span> <span class="kt">bool</span> <span class="n">p1</span> <span class="o">=</span> <span class="n">isprime</span><span class="p">(</span><span class="mi">997</span><span class="p">);</span>
</span><span class='line'>  <span class="n">constexpr</span> <span class="kt">bool</span> <span class="n">p2</span> <span class="o">=</span> <span class="n">isprime</span><span class="p">((</span><span class="mi">1</span> <span class="o">&lt;&lt;</span> <span class="mi">20</span><span class="p">)</span> <span class="o">-</span> <span class="mi">1</span><span class="p">);</span>
</span><span class='line'>  <span class="n">std</span><span class="o">::</span><span class="n">cout</span> <span class="o">&lt;&lt;</span> <span class="n">p1</span> <span class="o">&lt;&lt;</span> <span class="s">&quot; &quot;</span> <span class="o">&lt;&lt;</span> <span class="n">p2</span> <span class="o">&lt;&lt;</span> <span class="n">std</span><span class="o">::</span><span class="n">endl</span><span class="p">;</span>
</span><span class='line'><span class="p">}</span>
</span></code></pre></td></tr></table></div></figure>


<p>This program will reach a depth of at most sqrt(2<sup>20</sup>), which is
2<sup>10</sup> = 1024, and since we check at most slightly less than half of
those, we will be fine. Since 2<sup>20</sup> - 1 is not a prime number, we will
be even more fine. However, if we try to check whether 2<sup>31</sup> - 1 is prime, the
compiler will refuse to do so. In this case we can fix it by
increasing the maximum constexpr evaluation depth for our compiler,
but in general that might take too much RAM, so let&#8217;s do our own tail
call elimination with <a href="http://en.wikipedia.org/wiki/Continuation-passing_style">continuation passing style</a> and a
<a href="http://en.wikipedia.org/wiki/Trampoline_(computing)">trampoline</a>
instead.</p>

<p>This way, we have a trampoline function that takes a CPS function as
an argument and loops, continuously calling the continuation returned
from the previous call.</p>
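
<p>Stripped of the compile-time constraints, the idea is just a loop. The following is a minimal run-time sketch of it, not the constexpr version built later in this post. The names <code>State</code>, <code>step</code>, <code>bounces</code> and <code>isprime_runtime</code> are made up for the illustration.</p>

<pre><code class='c++'>#include &lt;iostream&gt;

// One step of the primality check in continuation passing style:
// instead of recursing, step() returns the state the next call starts from.
struct State {
  unsigned long n;
  unsigned long div;
  bool done;
  bool result;
};

State step(State s) {
  if (s.n % s.div == 0) return State{s.n, s.div, true, false};
  if (s.div * s.div &gt; s.n) return State{s.n, s.div, true, true};
  return State{s.n, s.div + 2, false, false};
}

// The trampoline: a plain loop that keeps bouncing until the state is done.
// bounces (hypothetical out-parameter) records how many steps were needed.
State trampoline(State s, unsigned long&amp; bounces) {
  bounces = 0;
  while (!s.done) {
    s = step(s);
    ++bounces;
  }
  return s;
}

bool isprime_runtime(unsigned long n, unsigned long&amp; bounces) {
  bounces = 0;
  if (n == 2 || n == 3) return true;
  if (n &lt; 2 || n % 2 == 0) return false;
  return trampoline(State{n, 3, false, false}, bounces).result;
}

int main() {
  unsigned long bounces = 0;
  const bool prime = isprime_runtime((1UL &lt;&lt; 31) - 1, bounces);
  std::cout &lt;&lt; prime &lt;&lt; &quot; &quot; &lt;&lt; bounces &lt;&lt; std::endl; // prints &quot;1 23170&quot;
}
</code></pre>

<p>The bounce count is the depth the plain recursive version would need for the same number, which is why 2<sup>31</sup> - 1 is too deep for it. In a C++11 constexpr we have no <code>while</code>, so the loop itself has to be simulated.</p>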

<h2>Exponential recursion</h2>

<p>In order to use a trampoline, we still need an infinite or
near-infinite loop for the basic trampoline function. This is where
exponential recursion comes in. The recursion in <code>rec_isprime</code> behaves
sort of like a linked list - each invocation links to the next. If
each <code>rec_isprime</code> call instead made two <code>rec_isprime</code> calls, which in
turn made two others, etc, we would get a call graph that was more
like a binary tree than a linked list. This way we would not reach our
maximum recursion depth until we had visited 2<sup>512</sup> nodes and we would
never blow the stack!</p>

<p>Here is such a function:</p>

<figure class='code'><figcaption><span>Exponential constexpr recursion - count invocations</span></figcaption><div class="highlight"><table><tr><td class="gutter"><pre class="line-numbers"><span class='line-number'>1</span>
<span class='line-number'>2</span>
<span class='line-number'>3</span>
<span class='line-number'>4</span>
<span class='line-number'>5</span>
<span class='line-number'>6</span>
<span class='line-number'>7</span>
<span class='line-number'>8</span>
<span class='line-number'>9</span>
<span class='line-number'>10</span>
<span class='line-number'>11</span>
<span class='line-number'>12</span>
<span class='line-number'>13</span>
<span class='line-number'>14</span>
</pre></td><td class='code'><pre><code class='c++'><span class='line'><span class="k">const</span> <span class="kt">int</span> <span class="n">MAXD</span> <span class="o">=</span> <span class="mi">24</span><span class="p">;</span>
</span><span class='line'>
</span><span class='line'><span class="n">constexpr</span> <span class="kt">int</span> <span class="n">count</span><span class="p">(</span><span class="kt">int</span> <span class="n">n</span><span class="p">,</span> <span class="kt">int</span> <span class="n">depth</span><span class="o">=</span><span class="mi">1</span><span class="p">){</span>
</span><span class='line'>  <span class="k">return</span> <span class="n">depth</span> <span class="o">==</span> <span class="n">MAXD</span> <span class="o">?</span>
</span><span class='line'>           <span class="n">n</span> <span class="o">+</span> <span class="mi">1</span>
</span><span class='line'>         <span class="o">:</span> <span class="n">count</span><span class="p">(</span><span class="n">n</span><span class="p">,</span> <span class="n">depth</span> <span class="o">+</span> <span class="mi">1</span><span class="p">)</span> <span class="o">+</span> <span class="n">count</span><span class="p">(</span><span class="n">n</span><span class="p">,</span> <span class="n">depth</span> <span class="o">+</span> <span class="mi">1</span><span class="p">)</span> <span class="o">+</span> <span class="mi">1</span><span class="p">;</span>
</span><span class='line'><span class="p">}</span>
</span><span class='line'>
</span><span class='line'><span class="cp">#include&lt;iostream&gt;</span>
</span><span class='line'>
</span><span class='line'><span class="kt">int</span> <span class="n">main</span><span class="p">(){</span>
</span><span class='line'>  <span class="n">constexpr</span> <span class="kt">int</span> <span class="n">i</span> <span class="o">=</span> <span class="n">count</span><span class="p">(</span><span class="mi">0</span><span class="p">);</span>
</span><span class='line'>  <span class="n">std</span><span class="o">::</span><span class="n">cout</span> <span class="o">&lt;&lt;</span> <span class="n">i</span> <span class="o">&lt;&lt;</span> <span class="n">std</span><span class="o">::</span><span class="n">endl</span><span class="p">;</span>
</span><span class='line'><span class="p">}</span>
</span></code></pre></td></tr></table></div></figure>


<p>This program will visit 2<sup>24</sup> - 1 = 16777215 nodes. On my normal
desktop computer with clang 3.1 this program took 27 seconds to
compile and the process used almost no memory at all. With GCC 4.7.2
it compiles in 0.2 seconds! This is because GCC uses memoization for
both templates and constexprs, so it will instantly see that the other
count call is unnecessary and transform our 16 million node tree to a
linked list of depth 24 again.</p>

<p>That is sort of cheating, so let&#8217;s change the function slightly:</p>

<figure class='code'><figcaption><span>Exponential constexpr recursion - count without memoization</span></figcaption><div class="highlight"><table><tr><td class="gutter"><pre class="line-numbers"><span class='line-number'>1</span>
<span class='line-number'>2</span>
<span class='line-number'>3</span>
<span class='line-number'>4</span>
<span class='line-number'>5</span>
<span class='line-number'>6</span>
<span class='line-number'>7</span>
<span class='line-number'>8</span>
<span class='line-number'>9</span>
<span class='line-number'>10</span>
<span class='line-number'>11</span>
<span class='line-number'>12</span>
</pre></td><td class='code'><pre><code class='c++'><span class='line'><span class="k">const</span> <span class="kt">int</span> <span class="n">MAXD</span> <span class="o">=</span> <span class="mi">24</span><span class="p">;</span>
</span><span class='line'>
</span><span class='line'><span class="n">constexpr</span> <span class="kt">int</span> <span class="n">count</span><span class="p">(</span><span class="kt">int</span> <span class="n">n</span><span class="p">,</span> <span class="kt">int</span> <span class="n">depth</span><span class="o">=</span><span class="mi">1</span><span class="p">){</span>
</span><span class='line'>  <span class="k">return</span> <span class="n">depth</span> <span class="o">==</span> <span class="n">MAXD</span> <span class="o">?</span> <span class="n">n</span> <span class="o">+</span> <span class="mi">1</span><span class="o">:</span> <span class="n">count</span><span class="p">(</span><span class="n">count</span><span class="p">(</span><span class="n">n</span><span class="p">,</span> <span class="n">depth</span> <span class="o">+</span> <span class="mi">1</span><span class="p">),</span> <span class="n">depth</span> <span class="o">+</span> <span class="mi">1</span><span class="p">)</span> <span class="o">+</span> <span class="mi">1</span><span class="p">;</span>
</span><span class='line'><span class="p">}</span>
</span><span class='line'>
</span><span class='line'><span class="cp">#include&lt;iostream&gt;</span>
</span><span class='line'>
</span><span class='line'><span class="kt">int</span> <span class="n">main</span><span class="p">(){</span>
</span><span class='line'>  <span class="n">constexpr</span> <span class="kt">int</span> <span class="n">i</span> <span class="o">=</span> <span class="n">count</span><span class="p">(</span><span class="mi">0</span><span class="p">);</span>
</span><span class='line'>  <span class="n">std</span><span class="o">::</span><span class="n">cout</span> <span class="o">&lt;&lt;</span> <span class="n">i</span> <span class="o">&lt;&lt;</span> <span class="n">std</span><span class="o">::</span><span class="n">endl</span><span class="p">;</span>
</span><span class='line'><span class="p">}</span>
</span></code></pre></td></tr></table></div></figure>


<p>Now memoization will not work. Clang takes 26 seconds and still uses
no memory. GCC surprises us again, now taking 44 seconds and using
over 3 gigs of RAM! If we increase MAXD to 25, GCC will again use
twice as much RAM. I don&#8217;t know if this is because of memoization or
something else. I have reported it as a
<a href="http://gcc.gnu.org/bugzilla/show_bug.cgi?id=55442">bug</a> to GCC.</p>

<p>Still though, it works!</p>
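
<p>To convince yourself that both variants really visit 2<sup>depth</sup> - 1 nodes, it helps to shrink the tree until it compiles instantly. This is a small sketch under that assumption; <code>SMALLD</code>, <code>count_sum</code> and <code>count_nested</code> are names invented here, standing in for <code>MAXD</code> and the two versions of <code>count</code>.</p>

<pre><code class='c++'>#include &lt;iostream&gt;

const int SMALLD = 4; // hypothetical small depth, so the tree has only 15 nodes

// Same shape as the first count: two sibling calls, added together.
constexpr int count_sum(int n, int depth = 1) {
  return depth == SMALLD ?
           n + 1
         : count_sum(n, depth + 1) + count_sum(n, depth + 1) + 1;
}

// Same shape as the second count: the inner result feeds the outer call,
// so the two calls never have identical arguments.
constexpr int count_nested(int n, int depth = 1) {
  return depth == SMALLD ?
           n + 1
         : count_nested(count_nested(n, depth + 1), depth + 1) + 1;
}

// Both must equal the node count of a full binary tree with SMALLD levels.
static_assert(count_sum(0) == (1 &lt;&lt; SMALLD) - 1, &quot;sum variant miscounts&quot;);
static_assert(count_nested(0) == (1 &lt;&lt; SMALLD) - 1, &quot;nested variant miscounts&quot;);

int main() {
  std::cout &lt;&lt; count_sum(0) &lt;&lt; &quot; &quot; &lt;&lt; count_nested(0) &lt;&lt; std::endl; // prints &quot;15 15&quot;
}
</code></pre>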

<h2>The trampoline</h2>

<figure class='code'><figcaption><span>Checking primality with a trampoline</span></figcaption><div class="highlight"><table><tr><td class="gutter"><pre class="line-numbers"><span class='line-number'>1</span>
<span class='line-number'>2</span>
<span class='line-number'>3</span>
<span class='line-number'>4</span>
<span class='line-number'>5</span>
<span class='line-number'>6</span>
<span class='line-number'>7</span>
<span class='line-number'>8</span>
<span class='line-number'>9</span>
<span class='line-number'>10</span>
<span class='line-number'>11</span>
<span class='line-number'>12</span>
<span class='line-number'>13</span>
<span class='line-number'>14</span>
<span class='line-number'>15</span>
<span class='line-number'>16</span>
<span class='line-number'>17</span>
<span class='line-number'>18</span>
<span class='line-number'>19</span>
<span class='line-number'>20</span>
<span class='line-number'>21</span>
<span class='line-number'>22</span>
<span class='line-number'>23</span>
<span class='line-number'>24</span>
<span class='line-number'>25</span>
<span class='line-number'>26</span>
<span class='line-number'>27</span>
<span class='line-number'>28</span>
<span class='line-number'>29</span>
<span class='line-number'>30</span>
<span class='line-number'>31</span>
<span class='line-number'>32</span>
<span class='line-number'>33</span>
<span class='line-number'>34</span>
<span class='line-number'>35</span>
<span class='line-number'>36</span>
<span class='line-number'>37</span>
<span class='line-number'>38</span>
<span class='line-number'>39</span>
<span class='line-number'>40</span>
<span class='line-number'>41</span>
<span class='line-number'>42</span>
<span class='line-number'>43</span>
<span class='line-number'>44</span>
<span class='line-number'>45</span>
<span class='line-number'>46</span>
<span class='line-number'>47</span>
<span class='line-number'>48</span>
<span class='line-number'>49</span>
<span class='line-number'>50</span>
<span class='line-number'>51</span>
<span class='line-number'>52</span>
</pre></td><td class='code'><pre><code class='c++'><span class='line'><span class="k">const</span> <span class="kt">int</span> <span class="n">MAXD</span> <span class="o">=</span> <span class="mi">100</span><span class="p">;</span>
</span><span class='line'>
</span><span class='line'><span class="k">template</span><span class="o">&lt;</span><span class="k">typename</span> <span class="n">Fun</span><span class="o">&gt;</span>
</span><span class='line'><span class="n">constexpr</span> <span class="n">Fun</span> <span class="n">trampoline</span><span class="p">(</span><span class="k">const</span> <span class="n">Fun</span> <span class="n">f</span><span class="p">,</span> <span class="kt">int</span> <span class="n">depth</span><span class="o">=</span><span class="mi">0</span><span class="p">){</span>
</span><span class='line'>  <span class="k">return</span>
</span><span class='line'>    <span class="n">f</span><span class="p">.</span><span class="n">done_</span> <span class="o">?</span>
</span><span class='line'>      <span class="n">f</span>
</span><span class='line'>    <span class="o">:</span> <span class="p">(</span><span class="n">depth</span> <span class="o">==</span> <span class="n">MAXD</span> <span class="o">?</span>
</span><span class='line'>         <span class="n">f</span><span class="p">()</span>
</span><span class='line'>       <span class="c1">// If we used mutual recursion here, Fun would not have to check for done_</span>
</span><span class='line'>       <span class="c1">// unfortunately clang segfaults on mutual recursion in a constexpr..</span>
</span><span class='line'>       <span class="o">:</span> <span class="n">trampoline</span><span class="p">(</span><span class="n">trampoline</span><span class="p">(</span><span class="n">f</span><span class="p">(),</span> <span class="n">depth</span> <span class="o">+</span> <span class="mi">1</span><span class="p">)(),</span> <span class="n">depth</span> <span class="o">+</span> <span class="mi">1</span><span class="p">));</span>
</span><span class='line'><span class="p">}</span>
</span><span class='line'>
</span><span class='line'><span class="k">struct</span> <span class="n">Prime</span><span class="p">{</span>
</span><span class='line'>  <span class="k">explicit</span> <span class="n">constexpr</span> <span class="n">Prime</span><span class="p">(</span><span class="kt">unsigned</span> <span class="kt">long</span> <span class="n">n</span><span class="p">,</span> <span class="kt">unsigned</span> <span class="kt">long</span> <span class="n">div</span><span class="p">)</span>
</span><span class='line'>  <span class="o">:</span> <span class="n">n_</span><span class="p">(</span><span class="n">n</span><span class="p">)</span>
</span><span class='line'>  <span class="p">,</span> <span class="n">div_</span><span class="p">(</span><span class="n">div</span><span class="p">)</span>
</span><span class='line'>  <span class="p">,</span> <span class="n">done_</span><span class="p">(</span><span class="n">n</span> <span class="o">%</span> <span class="mi">2</span> <span class="o">==</span> <span class="mi">0</span> <span class="n">or</span> <span class="n">n</span> <span class="o">==</span> <span class="mi">3</span><span class="p">)</span>
</span><span class='line'>  <span class="p">,</span> <span class="n">retval_</span><span class="p">(</span><span class="n">n</span> <span class="o">==</span> <span class="mi">2</span> <span class="n">or</span> <span class="n">n</span> <span class="o">==</span> <span class="mi">3</span><span class="p">)</span>
</span><span class='line'>  <span class="p">{}</span>
</span><span class='line'>
</span><span class='line'>  <span class="k">explicit</span> <span class="n">constexpr</span> <span class="n">Prime</span><span class="p">(</span><span class="kt">bool</span> <span class="n">retval</span><span class="p">)</span>
</span><span class='line'>  <span class="o">:</span> <span class="n">n_</span><span class="p">(</span><span class="mi">0</span><span class="p">)</span>
</span><span class='line'>  <span class="p">,</span> <span class="n">div_</span><span class="p">(</span><span class="mi">0</span><span class="p">)</span>
</span><span class='line'>  <span class="p">,</span> <span class="n">done_</span><span class="p">(</span><span class="kc">true</span><span class="p">)</span>
</span><span class='line'>  <span class="p">,</span> <span class="n">retval_</span><span class="p">(</span><span class="n">retval</span><span class="p">)</span>
</span><span class='line'>  <span class="p">{}</span>
</span><span class='line'>
</span><span class='line'>  <span class="c1">//CPS! Prime does not recurse, but instead returns a continuation</span>
</span><span class='line'>  <span class="n">constexpr</span> <span class="n">Prime</span> <span class="k">operator</span><span class="p">()()</span> <span class="k">const</span> <span class="p">{</span>
</span><span class='line'>    <span class="k">return</span> <span class="p">(</span><span class="n">done_</span> <span class="o">?</span> <span class="o">*</span><span class="k">this</span>
</span><span class='line'>            <span class="o">:</span><span class="p">(</span><span class="n">n_</span> <span class="o">%</span> <span class="n">div_</span> <span class="o">==</span> <span class="mi">0</span> <span class="o">?</span>
</span><span class='line'>              <span class="n">Prime</span><span class="p">(</span><span class="kc">false</span><span class="p">)</span>
</span><span class='line'>              <span class="o">:</span> <span class="p">(</span><span class="n">div_</span> <span class="o">*</span> <span class="n">div_</span> <span class="o">&gt;</span> <span class="n">n_</span> <span class="o">?</span>
</span><span class='line'>                 <span class="n">Prime</span><span class="p">(</span><span class="kc">true</span><span class="p">)</span>
</span><span class='line'>                 <span class="o">:</span> <span class="n">Prime</span><span class="p">(</span><span class="n">n_</span><span class="p">,</span> <span class="n">div_</span> <span class="o">+</span> <span class="mi">2</span><span class="p">))));</span>
</span><span class='line'>  <span class="p">}</span>
</span><span class='line'>
</span><span class='line'>  <span class="kt">unsigned</span> <span class="kt">long</span> <span class="n">n_</span><span class="p">;</span>
</span><span class='line'>  <span class="kt">unsigned</span> <span class="kt">long</span> <span class="n">div_</span><span class="p">;</span>
</span><span class='line'>  <span class="kt">bool</span> <span class="n">done_</span><span class="p">;</span>
</span><span class='line'>  <span class="kt">bool</span> <span class="n">retval_</span><span class="p">;</span>
</span><span class='line'><span class="p">};</span>
</span><span class='line'>
</span><span class='line'><span class="cp">#include&lt;iostream&gt;</span>
</span><span class='line'>
</span><span class='line'><span class="kt">int</span> <span class="n">main</span><span class="p">(){</span>
</span><span class='line'>  <span class="n">constexpr</span> <span class="kt">bool</span> <span class="n">p1</span> <span class="o">=</span> <span class="n">trampoline</span><span class="p">(</span><span class="n">Prime</span><span class="p">((</span><span class="mi">1L</span> <span class="o">&lt;&lt;</span> <span class="mi">31</span><span class="p">)</span> <span class="o">-</span> <span class="mi">1</span><span class="p">,</span> <span class="mi">3</span><span class="p">)).</span><span class="n">retval_</span><span class="p">;</span>
</span><span class='line'>  <span class="n">constexpr</span> <span class="kt">bool</span> <span class="n">p2</span> <span class="o">=</span> <span class="n">trampoline</span><span class="p">(</span><span class="n">Prime</span><span class="p">((</span><span class="mi">1L</span> <span class="o">&lt;&lt;</span> <span class="mi">32</span><span class="p">)</span> <span class="o">-</span> <span class="mi">1</span><span class="p">,</span> <span class="mi">3</span><span class="p">)).</span><span class="n">retval_</span><span class="p">;</span>
</span><span class='line'>  <span class="n">std</span><span class="o">::</span><span class="n">cout</span> <span class="o">&lt;&lt;</span> <span class="n">p1</span> <span class="o">&lt;&lt;</span> <span class="s">&quot; &quot;</span> <span class="o">&lt;&lt;</span> <span class="n">p2</span> <span class="o">&lt;&lt;</span> <span class="n">std</span><span class="o">::</span><span class="n">endl</span><span class="p">;</span>
</span><span class='line'><span class="p">}</span>
</span></code></pre></td></tr></table></div></figure>


<p>As you can see, we are allowed to create objects with constexpr
constructors in a constexpr, so we can create containers like linked
lists and trees. You can create maps, (simulated) vectors and sets on
top of trees, which means that anything is possible. In truth, prime
number calculation is not very exciting, but this post is running long
as it is, so a compile-time CPS implementation of Game of Life or
something will have to wait for next time.</p>
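
<p>As a taste of the container idea, here is a minimal sketch of a singly linked list that is built and traversed entirely at compile time. The names <code>Node</code>, <code>sum</code>, <code>length</code> and the three node variables are hypothetical, and the nodes are declared at namespace scope so that their addresses are constant expressions.</p>

<pre><code class='c++'>#include &lt;iostream&gt;

struct Node {
  constexpr Node(int value, const Node* next)
  : value_(value)
  , next_(next)
  {}

  int value_;
  const Node* next_;
};

// Recursion replaces the loop we would write at run time.
constexpr int sum(const Node* node) {
  return node == nullptr ? 0 : node-&gt;value_ + sum(node-&gt;next_);
}

constexpr int length(const Node* node) {
  return node == nullptr ? 0 : 1 + length(node-&gt;next_);
}

// The list 1 -&gt; 2 -&gt; 3, linked back to front.
constexpr Node node_c(3, nullptr);
constexpr Node node_b(2, &amp;node_c);
constexpr Node node_a(1, &amp;node_b);

static_assert(sum(&amp;node_a) == 6, &quot;sum is computed by the compiler&quot;);
static_assert(length(&amp;node_a) == 3, &quot;so is the length&quot;);

int main() {
  std::cout &lt;&lt; sum(&amp;node_a) &lt;&lt; &quot; &quot; &lt;&lt; length(&amp;node_a) &lt;&lt; std::endl; // prints &quot;6 3&quot;
}
</code></pre>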

<p>Now, imagine the cheating possibilities in <a href="http://shootout.alioth.debian.org/">The Computer Language Benchmarks Game</a> ;). Yes, yes, I
know it can be done in some other compiled languages as well. Template
Haskell and the Lisps come to mind.</p>
]]></content>
  </entry>
  
  <entry>
    <title type="html"><![CDATA[C++11 and Boost - succinct like Python]]></title>
    <link href="http://gurgeh.github.com/blog/2012/10/31/c-plus-plus-11-and-boost-succinct-like-python/"/>
    <updated>2012-10-31T13:05:00+01:00</updated>
    <id>http://gurgeh.github.com/blog/2012/10/31/c-plus-plus-11-and-boost-succinct-like-python</id>
    <content type="html"><![CDATA[<p>C++11 is the new standard of C++ that was released last year. Yes, I know that it is now 2012, but compilers are only now starting to catch up and implement everything, though AFAIK there is not yet a fully compliant compiler.</p>

<p>With a combination of C++11 and the <a href="http://www.boost.org/">Boost</a> library, I think that it is possible to write code in a style that is almost as painless as in a modern dynamic language like Python. I also think that it is not so well known outside the C++ community how much C++ has changed for the better. Hence this post.</p>

<p>As an example, I have taken the first interesting exercise from <a href="http://www.diveintopython.net/toc/index.html">Dive into Python</a>, <a href="http://www.diveintopython.net/object_oriented_framework/index.html">fileinfo.py</a>, and converted it to C++, trying to remain as faithful as possible to the original code.</p>

<!--more-->


<p><code>fileinfo.py</code> implements a simple MP3-metadata printer, which is built so that it will be easy to add other metadata printers in the future. For the Python code, see the previous link and for my commented C++ code see here:</p>

<div><script src='https://gist.github.com/3871749.js?file='></script>
<noscript><pre><code>//Having to include so many different header files to do basic things like open a file, use strings, vectors and tuples, etc, is still annoying.
#include &lt;fstream&gt;
#include &lt;iostream&gt;
#include &lt;functional&gt;
#include &lt;vector&gt;
#include &lt;map&gt;
#include &lt;tuple&gt;
#include &lt;string&gt;

//To use C++11 lambdas with Boost lambdas we define this. Warning, bugs in Boost &lt; 1.51 and with compilers that are not standard compliant. Gcc 4.7+ should work. In Boost 1.52 and later we may not need the define at all.
#define BOOST_RESULT_OF_USE_DECLTYPE

#include &lt;boost/format.hpp&gt; // String formating
#include &lt;boost/algorithm/string.hpp&gt; // Join list of strings
#include &lt;boost/filesystem.hpp&gt; // Iterate over files and directories
#include &lt;boost/range/adaptors.hpp&gt; // Adaptors!
#include &lt;boost/range/algorithm/copy.hpp&gt; // We just need one algorithm

//Like from ... import * in Python. Beware namespace collisions. Don't open namespaces directly in a header file.
using namespace std;
using namespace boost::filesystem;
using namespace boost::adaptors;

typedef map&lt;string, string&gt; propMap; // Modelling mp3 metadata String Key/Val
typedef map&lt;path, function&lt;propMap(string)&gt;&gt; extLookup; // Extension to FileInfo functor

const int kTailSize = 128; // Size of Mp3 metadata section. Located at the end of the file.

string stripnulls(string s){
  using namespace boost::algorithm; // Example of local &quot;using&quot;
  erase_all(s, string(1, '\0')); //Remove the null bytes. A plain &quot;\0&quot; literal would be read as an empty C string and match nothing
  trim(s); //Trim whitespace in both ends of the string
  return s;
}

string ord(string s){
  int i = static_cast&lt;unsigned char&gt;(s[0]); //Plain char may be signed (it is implementation-defined), so we need a cast to treat it as unsigned, like the Python code
  return (boost::format(&quot;%1%&quot;) % i).str(); //Boost format is a type safe version of sprintf, which also avoids the problem with preallocating a buffer
}

//In C++11, passing functions around is much easier. Everything that behaves like a function from string to string can be stored in a function&lt;string(string)&gt;, including functors (objects with overloaded call semantics)
typedef function&lt;string(string)&gt; StrToStr;

//In the olden days we could not initialize a map inline like this
//Key -&gt; metadata mapping. Unfortunately tuples cannot be {}-initialized nested.
const map&lt;const string, const tuple&lt;int, int, StrToStr&gt;&gt; TagDataMap {
  {&quot;title&quot;   , make_tuple( 3,   30, stripnulls)},
  {&quot;artist&quot;  , make_tuple( 33,  30, stripnulls)},
  {&quot;album&quot;   , make_tuple( 63,  30, stripnulls)},
  {&quot;year&quot;    , make_tuple( 93,   4, stripnulls)},
  {&quot;comment&quot; , make_tuple( 97,  29, stripnulls)},
  {&quot;genre&quot;   , make_tuple(127,   1, ord)}};

propMap Mp3FileInfo(string p){
  propMap ret {{&quot;name&quot;, p}};

  ifstream f(p, ios::binary);
  if(f.fail()) return ret;

  string sbuf;
  sbuf.resize(kTailSize);
  f.seekg(-kTailSize, ios::end);
  f.read(&amp;sbuf[0], kTailSize);

  if(sbuf.substr(0,3) != &quot;TAG&quot;) return ret;

  int start, length;
  StrToStr mapfun;
  //for loops over collections are finally convenient to use.
  for(auto td : TagDataMap){
    tie (start, length, mapfun) = td.second; // &quot;tie&quot; is tuple deconstruction and assignment, just like in Python
    ret[td.first] = mapfun(sbuf.substr(start, length));
  }

  return ret;
}

vector&lt;propMap&gt; listDirectory(string directory, extLookup exts){
  directory_iterator startd(directory), endd;
  auto files = make_iterator_range(startd, endd);
  vector&lt;propMap&gt; retmap;
  
  for(path p : files){
    auto x = exts.find(p.extension());
    if(x != exts.end()){
      retmap.push_back(x-&gt;second(p.string()));
    }
  }
  /*
  I actually think that for loops are too general, and should be used as seldom as possible. Range algorithms like filter and transform tell the reader (and compiler) exactly what I am doing to my data and leaves less room for bugs and misinterpretation. I know this is heresy to many C++-ers, though.
The following is how the above for loop would look in a functional list-comprehension style, featuring Boost's range algorithms and the new C++11 lambda expression.

  boost::copy(files | filtered([exts](path p){return exts.count(p.extension());}) 
                    | transformed([exts](path p){return exts.find(p.extension())-&gt;second(p.string());}),
                    back_inserter(retmap));

I decided not to go with this anyway, since in this case the original for loop is so succinct.
*/

  return retmap;
}

int main(int argc, char* argv[]){
  //A map from file extension to function, replacing the brittle Python introspection method
  extLookup exts = {{&quot;.mp3&quot;, Mp3FileInfo}};
  //Get a property map for each file in the given directory
  for(propMap pm: listDirectory(argv[1], exts)){
    //&quot;join&quot; is like join in Python and &quot;pm | transformed&quot; pipes the property map through a mapping function that makes strings from the properties.
    cout &lt;&lt; join(pm | transformed([](propMap::value_type pv){
          return (boost::format(&quot;%1%=%2%&quot;) % pv.first % pv.second).str();
        }), &quot;\n&quot;);
    cout &lt;&lt; &quot;\n\n&quot;;
  }
}
</code></pre></noscript></div>


<p>If you remove my gratuitous commenting and the includes, you will notice that it is more or less as succinct as the Python code.</p>

<p>Perhaps more importantly, it is also as safe. There are no pointers to point at bad places or leak memory, and no buffers to overflow. The added emphasis on a more functional style is only part of why modern C++ is safer and less leaky than it once was. For example, raw pointers are actively discouraged (use smart pointers instead) and fixed-size arrays (not vectors!) are now finally in the standard library, so we use <code>array&lt;int, 10&gt;</code> instead of <code>int array[10]</code>.</p>
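
<p>A small sketch of those last two points, using only the standard library. The helper names <code>make_values</code>, <code>boxed_sum</code> and <code>out_of_range_is_caught</code> are invented for this illustration and are not part of the fileinfo program.</p>

<pre><code class='c++'>#include &lt;array&gt;
#include &lt;cstddef&gt;
#include &lt;iostream&gt;
#include &lt;memory&gt;
#include &lt;numeric&gt;
#include &lt;stdexcept&gt;

typedef std::array&lt;int, 10&gt; Values; // replaces int values[10]

// Fill the array with 1..10. It knows its own size and can be returned by value.
Values make_values() {
  Values v{};
  std::iota(v.begin(), v.end(), 1);
  return v;
}

// The unique_ptr owns the heap allocation, so there is no delete to forget.
std::unique_ptr&lt;int&gt; boxed_sum(const Values&amp; v) {
  return std::unique_ptr&lt;int&gt;(new int(std::accumulate(v.begin(), v.end(), 0)));
}

// at() checks bounds and throws instead of silently reading past the end.
bool out_of_range_is_caught(const Values&amp; v, std::size_t index) {
  try {
    v.at(index);
    return false;
  } catch (const std::out_of_range&amp;) {
    return true;
  }
}

int main() {
  const Values v = make_values();
  const std::unique_ptr&lt;int&gt; total = boxed_sum(v);
  std::cout &lt;&lt; v.size() &lt;&lt; &quot; &quot; &lt;&lt; *total &lt;&lt; &quot; &quot;
            &lt;&lt; out_of_range_is_caught(v, 10) &lt;&lt; std::endl; // prints &quot;10 55 1&quot;
}
</code></pre>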

<p>The fileinfo program is actually the basis of a presentation I gave at work, which had more explanations of the features used. It only scratches the surface of what you can do with C++11. You can read more about the new features in C++11 in <a href="http://blog.smartbear.com/software-quality/bid/167271/The-Biggest-Changes-in-C-11-and-Why-You-Should-Care">the biggest changes in C++11 and why you should care</a> and of course on <a href="http://en.wikipedia.org/wiki/C%2B%2B11">Wikipedia</a>.</p>

<p>Herb Sutter, who is chair of the ISO C++ standards committee and thus perhaps not completely unbiased, had this to say on the release of C++11:</p>

<blockquote><p>Now with C++11&#8217;s improvements that incorporate many of the best features of managed languages, modern C++ code is as clean and safe as code written in other modern languages, as well as fast, with performance by default and full access to the underlying system whenever you need it.</p></blockquote>

<p>I think it is almost there. For example, the language is still too large, so in a new project you need a good coding standard that decides which parts should be used and how. The standard library, even including Boost, is still a bit weak in some areas, like downloading a URL. Also the lack of a proper package manager, like pip/PyPI, hurts.</p>
]]></content>
  </entry>
  
  <entry>
    <title type="html"><![CDATA[No, really. Use Zsh.]]></title>
    <link href="http://gurgeh.github.com/blog/2012/09/28/no/"/>
    <updated>2012-09-28T13:23:00+02:00</updated>
    <id>http://gurgeh.github.com/blog/2012/09/28/no</id>
    <content type="html"><![CDATA[<p><a href="http://zsh.sourceforge.net/">Zsh</a> is the new hotness. Well, newer and hotter than Bash anyway, since the first version of Bash was released in June 1989, while the young and peppy Zsh was released in December 1990. In large part thanks to the configuration &#8220;skin&#8221; <a href="https://github.com/robbyrussell/oh-my-zsh">oh-my-zsh</a>, Zsh has gained a lot of popularity during the last year or so. I have used it for a few months myself and could not be happier, unless it produced chocolate ice cream ← <em>note to shell developers</em>.</p>

<p>This is a guide on why you need it and how you install, configure and use it. Sometimes just with links to the relevant sites.</p>

<!--more-->


<p>This is written partly for my colleagues, who I think would benefit from using Zsh instead of Bash on their desktops and on our servers.</p>

<h2>So what makes Zsh so great?</h2>

<ol>
<li>Powerful context based tab completion</li>
<li>Pattern matching/globbing on alien steroids</li>
<li>Themeable prompts</li>
<li>Loadable modules</li>
<li>Good spelling correction</li>
<li>Sharing of command history among all running shells (I like my command line history and all my Konsole tabs)</li>
<li>Global aliases</li>
</ol>


<p>Though I have used Bash for 10+ years, I am far from a Bash specialist. Some of these things can be enabled in Bash, but AFAIK they are not implemented quite as well.</p>

<h3>Context based tab completion</h3>

<p>File based tab completion is great and all, but zsh has tab completion for <em>everything</em>. It has knowledge about an impressive number of tools and scripts. It knows which commands <em>git</em> takes, which hosts are in my hosts file for <em>ssh</em>, which users my system has when I write <em>chown</em>, available packages to <em>apt-get</em>, etc. Using [tab] when writing commands is a bit like static type checking, since if you don&#8217;t get a completion you are probably writing your argument type in the wrong place.</p>

<p>For common use cases it is easy to write tab completion specifications for your own scripts.</p>

<h3>Globbing</h3>

<p>Globbing means command line parameter expansion, for example <code>ls *.html</code>. Zsh has its own <em>globbing language</em>. You can sort and filter by exclusion or inclusion on name, size, permission, owner, modification time. Everything.</p>

<figure class='code'><div class="highlight"><table><tr><td class='code'><pre><code class=''><span class='line'>ls *(.)            # list just regular files
</span><span class='line'>ls *(/)            # list just directories
</span><span class='line'>ls -ld *(/om[1,3]) # Show the three newest directories. "om" orders by modification time, newest first. "[1,3]" selects matches 1 through 3.
</span><span class='line'>rm -i *(.L0)       # Remove zero length files, prompt for each file
</span><span class='line'>ls *(^m0)          # Files not modified in the last 24 hours.
</span><span class='line'>emacs **/main.py   # Edit main.py, wherever it is in this directory tree. ** is great.
</span><span class='line'>ls **/*(.x)        # List all executable files in this tree
</span><span class='line'>ls *~*.*(.)        # List all regular files that do not have a dot in the filename (needs "setopt extendedglob")
</span><span class='line'>ls -l **/*(Lk+100) # List all files larger than 100kb in this tree
</span><span class='line'>ls DATA_[0-9](#c4,7).csv # List DATA_nnnn.csv to DATA_nnnnnnn.csv (needs "setopt extendedglob")</span></code></pre></td></tr></table></div></figure>


<p>These examples are happily borrowed from <a href="http://www.rayninfo.co.uk/tips/zshtips.html">Zzappers Best of ZSH Tips</a>, <a href="http://www.tuxradar.com/content/z-shell-made-easy">Z shell made easy</a> and <a href="http://grml.org/zsh/zsh-lovers.html">Zsh-lovers man page</a>. Skim through them all, when you have decided to give Zsh a try.</p>
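
<p>Glob qualifiers are just as useful in loops and scripts as on the interactive command line. A small sketch, assuming a directory tree with some <code>.log</code> files in it (the file pattern and the seven-day limit are arbitrary examples):</p>

<pre><code># . = regular files only
# m+7 = modified more than 7 days ago
# N = expand to nothing instead of reporting an error when no file matches
for f in **/*.log(.m+7N); do
  print -r -- "$f"
done
</code></pre>

<p>Without the <code>N</code> qualifier, Zsh aborts the command with a &#8220;no matches found&#8221; error when the pattern matches nothing, which is usually not what you want inside a loop.</p>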

<h3>Loadable modules</h3>

<p>Loadable modules are modules that give your shell additional functionality. Sort of like importing a library when you code. They can make the filters above even more interesting. For example expressing date constraints in a natural format. There are examples of using modules in the Zsh-lovers man page and full documentation in the <a href="http://www.math.technion.ac.il/Site/computing/docs/zsh/zsh_21.html">Zsh Modules Documentation</a>.</p>
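
<p>A short sketch of how modules are loaded and used. <code>zsh/datetime</code> and <code>zsh/mathfunc</code> ship with Zsh. The <code>age</code> function belongs to the calendar functions distributed with recent Zsh versions, so treat that last part as an assumption about your installation; the dates are arbitrary examples.</p>

<pre><code>zmodload zsh/datetime               # provides $EPOCHSECONDS and strftime
strftime '%Y-%m-%d' $EPOCHSECONDS   # print the current date

zmodload zsh/mathfunc               # floating point functions for $(( ... ))
print $(( sqrt(2) ))

autoload -U age                     # date filter built on the zsh/stat module
ls -l *(e:age 2012/09/01 2012/10/01:)   # files last modified during September 2012
</code></pre>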

<h3>Good spelling correction</h3>

<p>Zsh does not care if I write a filename in lowercase or mixed or whatever. When I try [tab] it will first try to complete on the exact match and then use a case insensitive match. Great! It also has spelling correction built-in in other places, suggesting which command you might have meant, etc. You don&#8217;t want full spelling correction on files though (deactivated by default). That is just annoying.</p>

<h3>Global aliases</h3>

<p>Aliases are nice, but global aliases are words that can be used anywhere on the command line, thus you can give certain files, filters or pipes a short name. Some examples:</p>

<figure class='code'><div class="highlight"><table><tr><td class='code'><pre><code class=''><span class='line'>alias -g L="|less" # Write L after a command to page through the output.
</span><span class='line'>alias -g TL='| tail -20'
</span><span class='line'>alias -g NUL="&gt; /dev/null 2&gt;&amp;1" # You get the idea.</span></code></pre></td></tr></table></div></figure>


<p>If you want to give a directory an alias, you use <em>hash</em>: <code>hash -d projs=~/projects/</code>. You can then refer to the directory as <code>~projs</code>.</p>

<p>Zsh also has suffix aliases, which means that you can tie a file suffix, let&#8217;s say &#8220;pdf&#8221;, to a command, for example xpdf: <code>alias -s pdf=xpdf</code>. Now if you just type the name of a pdf file, it will be displayed with xpdf. Similar to suffix aliases, if you turn on AUTO_CD, typing the name of a directory changes to it.</p>
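
<p>To make the expansion rules concrete, this is roughly how the aliases described above behave once they are defined. The file and directory names are placeholders.</p>

<pre><code>alias -g L='| less'
alias -s pdf=xpdf
hash -d projs=~/projects
setopt auto_cd

dmesg L            # global alias: runs dmesg | less
ls ~projs          # named directory: lists ~/projects
report.pdf         # suffix alias: runs xpdf report.pdf
~projs             # AUTO_CD: same as cd ~/projects
</code></pre>

<p>Because a global alias is expanded wherever the word appears on the command line, pick names you will not otherwise type as arguments. All-uppercase names are a common convention.</p>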

<h3>Installing Zsh</h3>

<p>OK, so now you know that you need to try Zsh out. What do you do? Simple. Head over to <a href="https://github.com/robbyrussell/oh-my-zsh">oh-my-zsh</a> and follow the instructions. When installing Zsh, you should customize two things.</p>

<h4>a) The prompt</h4>

<p>Extravagant prompts have become a badge of pride for Zshellers. It is quite easy to write your own prompt. I have:</p>

<p><img src="http://gurgeh.github.com/images/zsh-prompt.png" alt="My Zsh prompt" />
Note how you can control both the left part and right part of the line. You can also make multi-line prompts, but I like to conserve screen real estate.</p>

<p>On the left side I put stuff that is of interest almost all the time: my username, the name of the computer, the last two parts of the directory tree, whether I am in a git or hg repository and on which branch (very useful to keep from screwing up) and finally whether there is uncommitted code (the X turns into a neat V when there is nothing to commit).</p>

<p>On the right side, I put the number of jobs running from this terminal, free memory (good reminder on servers) and average CPU load during the last 5 minutes, as reported by <code>uptime</code>. CPU load is expressed as the number of CPUs utilized, so to make it meaningful, I also put the number of CPUs in parentheses after the uptime value. This is mostly useful if you have several servers with different numbers of CPUs. If the uptime value exceeds the number of CPUs, you know that you have problems. Finally I put a timestamp. Perhaps you think &#8220;How silly, I have a clock on my desktop!&#8221;, but the point is not to tell the current time, but that I can scroll back and see when I started a command and how long it took. It keeps a log. Another good thing about keeping colorful stuff on the right side of the prompt is that it makes it easy to scroll through your terminal output and find commands, when your last compile spewed out 200 lines.</p>

<p>Usually the window is much wider than in my screenshot, so you don&#8217;t often run into the right part of the prompt, but if you do, Zsh will magically remove it before you reach it, so it won&#8217;t look cluttered.</p>

<p>This is the code for my prompt. For some reason Github overrides my syntax highlighting settings when I give the gist a filename, so my gists are unnamed today:</p>

<div><script src='https://gist.github.com/3800072.js?file='></script>
<noscript><pre><code>function prompt_char {
    git branch &gt;/dev/null 2&gt;/dev/null &amp;&amp; echo '±' &amp;&amp; return
    hg root &gt;/dev/null 2&gt;/dev/null &amp;&amp; echo '☿' &amp;&amp; return
    echo '%(!.!.➜)'
}


function parse_hg_dirty {
  if [[ -n $(hg status -mard . 2&gt; /dev/null) ]]; then
    echo &quot;$ZSH_THEME_HG_PROMPT_DIRTY&quot;
  fi
}

function get_RAM {
  free -m | awk '{if (NR==3) print $4}' | xargs -i echo 'scale=1;{}/1000' | bc
}

function get_nr_jobs() {
  jobs | wc -l
}

function get_nr_CPUs() {
  grep -c &quot;^processor&quot; /proc/cpuinfo
}

function get_load() {
  uptime | awk '{print $11}' | tr ',' ' '
}

PROMPT='%{$fg_bold[green]%}%n@%m %{$fg[cyan]%}%2c %{$fg_bold[blue]%}$(git_prompt_info)$(parse_hg_dirty)%{$fg_bold[blue]%} %{$fg_bold[red]%}$(prompt_char) % %{$reset_color%}'

RPROMPT='%{$fg_bold[red]%}[$(get_nr_jobs), $(get_RAM)G, $(get_load)($(get_nr_CPUs))] %{$fg_bold[green]%}%*%{$reset_color%}'

ZSH_THEME_HG_PROMPT_PREFIX=&quot;hg:(%{$fg[red]%}&quot;
ZSH_THEME_GIT_PROMPT_PREFIX=&quot;git:(%{$fg[red]%}&quot;
ZSH_THEME_GIT_PROMPT_SUFFIX=&quot;%{$reset_color%}&quot;
ZSH_THEME_GIT_PROMPT_DIRTY=&quot;%{$fg[blue]%}) %{$fg[yellow]%}✗%{$reset_color%}&quot;
ZSH_THEME_HG_PROMPT_DIRTY=&quot;%{$fg[yellow]%}✗%{$reset_color%}&quot;
ZSH_THEME_GIT_PROMPT_CLEAN=&quot;%{$fg[blue]%})&quot;
</code></pre></noscript></div>


<p>You can take it or any other from the <a href="https://github.com/robbyrussell/oh-my-zsh/wiki/themes">theme gallery</a> (you should visit that now, it is full of themes and colorful screenshots of what you can use immediately after installing oh-my-zsh) as a base if you want to write your own. You can call all the scripts and commands that your heart desires to produce the prompt output.</p>

<p>Currently, my prompt is not part of the oh-my-zsh distribution (I have not made a pull request. Perhaps I should?), so if you want to use it, just download it and copy it to ~/.oh-my-zsh/themes/gurgeh.zsh-theme.</p>

<p>A good prompt really is a great help.</p>
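
<p>The <code>get_load</code> and <code>get_RAM</code> helpers in the theme above pick fields out of the human-readable output of <code>uptime</code> and <code>free</code>, and the position of those fields can change with the length of the uptime and with the version of <code>free</code>. A possible alternative, sketched here under the assumption of a Linux system whose kernel is new enough (3.14 or later) to report <code>MemAvailable</code>, reads the same numbers from <code>/proc</code>:</p>

<pre><code># Sketch of more robust replacements for two helpers in the theme above.
# Assumes Linux; MemAvailable needs kernel 3.14 or later.

function get_load() {
  # /proc/loadavg starts with the 1, 5 and 15 minute load averages;
  # print the 5 minute one.
  awk '{print $2}' /proc/loadavg
}

function get_RAM() {
  # MemAvailable is given in kB; convert to GiB with one decimal.
  awk '/^MemAvailable:/ {printf "%.1f\n", $2 / 1048576}' /proc/meminfo
}
</code></pre>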

<h4>And b) the plugins</h4>

<p>Oh-my-Zsh has a lot of great plugins. You can read more about them on its GitHub page. I found it easiest to just read through the simple source code of the ones that interested me. These are the ones that I use (from my ~/.zshrc file):</p>

<p><code>plugins=(git mercurial autojump command-not-found python pip github gnu-utils history-substring-search)</code></p>

<p>To use autojump on Ubuntu, you need to <code>sudo apt-get install autojump</code>. That enables you to just write <code>j [directory]</code> and it will jump to the most frequently used directory with that name. A great way to navigate, which naturally can use tab-completion. It just takes some time for your fingers to get used to it. Autojump is not limited to Zsh, of course.</p>

<h3>My configuration</h3>

<p>My Zsh configuration is nothing special; you can check out part of it here:</p>

<div><script src='https://gist.github.com/3800232.js?file='></script>
<noscript><pre><code>
# Path to your oh-my-zsh configuration.
ZSH=$HOME/.oh-my-zsh

# Set name of the theme to load.
# Look in ~/.oh-my-zsh/themes/
# Optionally, if you set this to &quot;random&quot;, it'll load a random theme each
# time that oh-my-zsh is loaded.
ZSH_THEME=&quot;gurgeh&quot;

# I don't like case sensitive completion
# CASE_SENSITIVE=&quot;true&quot;

# Disable the weekly auto-update checks (oh-my-zsh otherwise updates itself).
DISABLE_AUTO_UPDATE=&quot;true&quot;

# Uncomment following line if you want to disable colors in ls
# DISABLE_LS_COLORS=&quot;true&quot;

# Uncomment following line if you want to disable autosetting terminal title.
# DISABLE_AUTO_TITLE=&quot;true&quot;

# Uncomment following line if you want red dots to be displayed while waiting for completion
COMPLETION_WAITING_DOTS=&quot;true&quot;

# Which plugins would you like to load? (plugins can be found in ~/.oh-my-zsh/plugins/*)
# Custom plugins may be added to ~/.oh-my-zsh/custom/plugins/
# Example format: plugins=(rails git textmate ruby lighthouse)
plugins=(git mercurial autojump command-not-found python pip github gnu-utils history-substring-search)

source $ZSH/oh-my-zsh.sh

# Customize to your needs...

#I don't like this feature. I think no one does. It corrects you, when you are trying to create new files, for example.
unsetopt correctall 

export PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/usr/games
export EDITOR=emacs

# And some other stuff here...
</code></pre></noscript></div>


<p>I probably should add some more aliases.</p>

<h3>Miscellaneous tips and tricks</h3>

<p>In Bash, we often use <em>PgUp</em> to search through the command history. In Zsh you just write part of the command and press <em>Up</em>. This will let you cycle through all command lines that contain what you have written, not just those that begin with it. If you don&#8217;t write anything, <em>Up</em> works as usual, by cycling through all commands.</p>
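
<p>The substring matching described here is what the <code>history-substring-search</code> plugin from the plugin list above provides. If the arrow keys do not trigger it in your terminal, a fragment like the following, placed in ~/.zshrc after oh-my-zsh has been sourced, binds them by hand. The escape sequences are the ones many terminals send for Up and Down, which is an assumption about yours; press Ctrl-V followed by the key to see what it really sends.</p>

<pre><code># Widgets defined by the history-substring-search plugin.
bindkey '^[[A' history-substring-search-up     # Up arrow
bindkey '^[[B' history-substring-search-down   # Down arrow
</code></pre>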

<p>Write <em>zsh_stats</em> to get statistics on which commands you use the most. Use <em>jumpstat</em> to get statistics on which paths you use the most.</p>

<p>Well, that&#8217;s it for now. Make the switch now! And browse the resources I have linked to for more tips.</p>
]]></content>
  </entry>
  
  <entry>
    <title type="html"><![CDATA[Octopress and Github as a blogging platform]]></title>
    <link href="http://gurgeh.github.com/blog/2012/09/27/octopress-and-github-as-blogging-platform/"/>
    <updated>2012-09-27T16:37:00+02:00</updated>
    <id>http://gurgeh.github.com/blog/2012/09/27/octopress-and-github-as-blogging-platform</id>
    <content type="html"><![CDATA[<p>I have switched from Blogspot to <a href="http://octopress.org">Octopress</a>. Any self-respecting coder should realize that 1) blogging is text and 2) text should be in revision control. Also 3) blogging is public text, so the revision control can be on a public server, like Github.</p>

<!--more-->


<p>It was fairly easy: just follow the instructions on Octopress. If you are running Ubuntu, don&#8217;t <code>apt-get install rbenv</code>; use the latest version from Github instead. This is actually what the Octopress instructions tell you to do, but I ignored it and I am now happy to have survived.</p>

<p>The Octopress instructions for deploying to Github are also straightforward enough, as are the Github instructions for using your own domain name.</p>

<p>I used an <a href="https://gist.github.com/1765496">external script</a> to migrate my Blogspot articles to my new site. It worked OK, but formatting was lost and the comments became static text. If you browse them, you will see that they currently look kinda ugly. When I have fixed them up a bit, I will begin forwarding all my old posts from Blogspot to fendrich.se.</p>

<p>There is not an abundance of themes yet, but you can always design one yourself. Picking colors and doing basic layout is easy. If you are design impaired, like me, there are a few ready-made <a href="https://github.com/imathis/octopress/wiki/3rd-Party-Octopress-Themes">here</a>. Currently I use Darkstripes.</p>

<p>Also, a good blog has comments (and a good blog reader comments!), so I enabled <a href="http://disqus.com">Disqus</a>.</p>

<p>Google Analytics support is built in, but just to be hip, I use <a href="http://getclicky.com">Clicky</a> instead. It was simple to install. Just put the tracking code in <code>/source/_includes/custom/after_footer.html</code>.</p>

<p>I created my own favicon from my Twitter picture by using this <a href="http://www.favicon.cc/">tool</a> and stuck it in the <em>/source</em> directory. Possibly because I am using Darkstripes, I had to edit <code>/source/_includes/head.html</code> and change <em>favico.png</em> to <em>favico.ico</em>.</p>

<p>That&#8217;s basically it. Hopefully this means I blog more. And better. And that the next post contains code. And perhaps even a pretty image. Or graph. And full sentences.</p>
]]></content>
  </entry>
  
  <entry>
    <title type="html"><![CDATA[Guerilla - my attempt to build a strong AI]]></title>
    <link href="http://gurgeh.github.com/blog/2011/11/10/guerilla-my-attempt-to-build-strong-ai/"/>
    <updated>2011-11-10T00:00:00+01:00</updated>
    <id>http://gurgeh.github.com/blog/2011/11/10/guerilla-my-attempt-to-build-strong-ai</id>
    <content type="html"><![CDATA[<div class='post'>
Having given the topic of artificial general intelligence (or AGI) a fair amount of thought over the last decade, a plausible path forward, based on something like an encyclopedia for algorithms, has slowly congealed in my head. I need to write it down before I start coding. Preferably in the form of a blog post, to get feedback. This is that post.<br /><br /><span class="Apple-style-span" style="font-size: large;">Background</span><br /><br />When I write about strong AI or AGI, I mean an algorithm with general problem solving skills. Not necessarily a mind inhabiting a robot, running around talking to people and passing the Turing test (though eventually a successful AGI could be taken in that direction), but rather something that can be applied to a wide variety of problems. For example: playing chess and poker, picking stocks, solving puzzles, proving math theorems, analyzing and writing computer code, speech and image recognition, improve itself by learning and self modification, etc.<br /><br /><br />There is an alluring approach to AGI, where you begin with a &#8220;simple&#8221; seed program, which will learn and self improve and eventually evolve to human intelligence and beyond. I think that in practice, the problem is that it has taken lots of people lots of time to invent all those algorithms that can be useful in general problem solving.&nbsp;Humans are pretty good at general problem solving - certainly much better than the best software/hardware combination we have today. Constructing algorithms to solve specific problems is in itself a kind of problem solving, and we humans have certainly invented many different algorithms for a wide variety of purposes. 
One might suspect that&nbsp;the computational depth of inventing/discovering algorithms is very large.&nbsp;So unless the seed program is actually very advanced, more like a full-grown Sequoia than a seed, it might take too much time for it to invent all those algorithms and heuristics that we, as a civilization, already have.<br /><br /><br />Topics that are potentially useful to understand include:<br /><br /><br /><ul><li>Statistics (Bayes&#8217; rule, distributions, Markov chains, running an experiment, etc).</li><li>Algorithms for optimizing parameters (genetic algorithms, simulated annealing, steepest descent, linear programming, random testing and purely analytical methods). In some situations it could take days or more to try a single parameter configuration. In other situations evaluating the fitness of a parameter configuration is just a couple of CPU instructions. Approaching these different tasks requires a variety of methods.</li><li>Logic</li><li>Basic mathematics (calculus, algebra, geometry, etc)</li><li>Code analysis (lambda calculus, etc)</li><li>Formal proof methods (knowledge of the methods listed here: http://en.wikipedia.org/wiki/Mathematical_proof) and formal reasoning</li><li>Tree and graph searching (depth-first, breadth-first, A*, beam, minimax, alpha-beta, Dijkstra)</li><li>Bayesian belief networks</li><li>Pattern recognition</li><li>Compression</li><li>Monte Carlo method</li><li>Clustering and classification</li><li>Fourier transforms, wavelets</li><li>Function approximation (analytical or with neural networks or genetic programming)</li><li>Inverting functions (in other words, given a program function and its output tell me what the input was - this turns out to be a very general way of posing questions)</li></ul><br /><div>&#8230;and of course many more.</div><div><br /></div><div>I consider statistics and parameter optimization to be the most important areas for intelligence, since you need them to learn. 
Pattern recognition (perhaps implemented with statistics and optimization) and various forms of tree searching are also vital.</div><br /><span class="Apple-style-span" style="font-size: large;">An encyclopedia of algorithms</span><br /><br />My approach is based on implementing an encyclopedia of useful algorithms that:<br /><br /><ol><li>Know to which tasks they can be applied</li><li>Can give a rough, initially often ridiculously rough, estimate of what the probability is that they solve the task after a certain time or, in the case of an open-ended task such as optimization, can give a rough estimate of how well the task is solved after a certain time.</li><li>Can continuously update the estimate as the task is solved</li></ol><br />It is important to stress that it is not enough to just implement a library of algorithms that can work on the same datastructures. The important thing is that you need metadata, describing when an algorithm can be used and the algorithmic complexity in time and memory. With time, you want to automatically build up more knowledge of the algorithms, gradually improving the time and success estimations as well as improving your knowledge of which algorithms are suitable in which situations.<br /><br />The algorithms should be broken up into as many natural subtasks as possible, so that when new algorithms are added to the system, they can try to solve these subtasks as well, thus creating new hybrid algorithms.<br /><br />A <i>task</i> is basically a function call together with its arguments. An algorithm that can solve a task implements the corresponding function and a time/success estimator. Similarly to function overloading in C++, the function header might state specializations, additional properties, of the function arguments that must be true for the algorithm to be a contender to solve it. 
It is important that the <i>Scheduler</i> (see below) immediately knows which algorithms are suitable for a certain task, so the mentioned &#8220;additional argument properties&#8221; must be immediately available. If a certain property requires work to find out - &#8220;is the list in the first argument sorted?&#8221; - and an algorithm still needs it, a new algorithm can be constructed that first checks if the list is sorted and then either fails or posts the task again with the new property set. This new algorithm would have higher estimates of running time and lower estimates of success than the original algorithm.<br /><br /><span class="Apple-style-span" style="font-size: large;">The Scheduler</span><br /><br />When a task is added to the task pool it always has a <i>price</i> attached to it. The Scheduler runs those algorithms that currently promise the best expected price per time unit. Algorithms that need subtasks solved have to assign a price to those too, before adding them to the task pool. That price should reasonably reflect how much of the overall time the subtask is expected to take. If it turns out that a subtask consistently takes a smaller or larger fraction of the estimated total time, there should be algorithms that modify the price for these subtasks and correspondingly the total time estimate (also, see <i>Self Improvement</i> below).<br /><br />Open-ended tasks where something should be optimized cannot have just one value attached to them. Instead they need to have a function from achieved performance to price, or at least a rough mapping from some performance values to price. 
This mapping stops the system from optimizing for too long on a relatively unimportant subsubsubtask somewhere.<br /><br />Algorithms that can either fail or succeed on a task need a similar mapping, where they give the probability of success as a function of time.<br /><br />One can also imagine that the algorithms could give a confidence interval or standard deviation on their estimates to tell the Scheduler how sure they are of their estimates, but I am not quite sure how this should be used, so for now they won&#8217;t.<br /><br />For my first try, the Scheduler will use a simple heuristic. The algorithm that claims to have the best price / time ratio for any task currently in the task pool will get to run it. For one thread this will be optimal in some sense. It gets more complex when you have many algorithms running in parallel on multiple cores or even clusters. For example, you want to slightly punish two algorithms trying to complete the same task in parallel, since the first one to succeed will always make the other algorithm&#8217;s work moot. On the other hand sometimes it makes sense to attack an important problem from several angles, so you don&#8217;t want to forbid it entirely either.<br /><br />In a later design, the Scheduler should be able to use the task pool to think about how it should schedule. Obviously this must not end in an infinite scheduling loop or general inefficiency, since normally the Scheduler must work very quickly.<br /><br />The Scheduler&#8217;s work and indeed that of the whole system will not be especially interesting when there are only a few algorithms implemented. The first interesting moment will be when new hybrid algorithms emerge, where subtasks are sometimes handled by unanticipated algorithms. I am not sure how many algorithms need to be implemented for the system to show interesting emergent behaviour. 
Probably more than ten, but less than a hundred, depending, of course, on which algorithms and what you count as an individual algorithm.<br /><br /><span class="Apple-style-span" style="font-size: large;">Self improvement</span><br /><br />From the above, you can see that the system will not be self improving at first. However, by adding self improvement tasks, it will start doing things like improving the time estimates of the algorithms, learn to what degree one algorithm&#8217;s failure to solve a task should also reflect on the estimates of other algorithms, learn which situations are suitable for which algorithms; for example which algorithms perform well on the subtasks posted from a certain algorithm. It can also have an algorithm that constructs new lower-priced training tasks from real tasks, for example generalizations or specializations of a problem, just out of &#8220;curiosity&#8221;.<br /><br />Producing new/improved code and algorithms, either for self improvement or as the solution for a puzzle or some other task, is among the most advanced tasks the system can try. It will not be able to do much of interest in this area until it is really strong, but it could start out by trying simple modifications of existing algorithms or trying them on similar tasks, a bit like in genetic programming.<br /><br />The system is also inherently self improving from a sort of network effect, since for each algorithm added, the existing algorithms get potentially better.<br /><br /><span class="Apple-style-span" style="font-size: large;">What now?</span><br /><br />When I have implemented the base system, I will start by applying the AGI to function inversion. Trivial stuff at first, of course, but I hope to eventually make it solve real puzzles like Towers of Hanoi by a combination of searching and deduction. 
Also, it would be fun to try some games and an NP-complete problem like 3-SAT.<br /><i><br /></i><br />It would be beautiful if the algorithms were written in the same simplified, purely functional (thus easier to analyze), LISP that I plan to write the problem definitions in. Alas, good AI needs to be fast and a 100x slower system just because the algorithms run in my own immature poorly interpreted language instead of C is not so fun. However, a good JIT compiler is a very good test for an AGI. You continuously have to weigh optimization time against running what you have. If the AGI in some distant future JITs its own code, effectively running and optimizing itself, I will consider the entire project a grand success :).<br /><br />I forget why I called the project Guerilla. It was probably terribly clever. Nevertheless, here is the link to the Github repository:&nbsp;<a href="https://github.com/gurgeh/Guerilla">https://github.com/gurgeh/Guerilla</a>. It does not contain much yet.<br />  <div class="zemanta-pixie" style="height: 15px; margin-top: 10px;"></div></div>
<h2>Comments</h2>
<div class='comments'>
<div class='comment'>
<div class='author'>sm4096</div>
<div class='content'>
Ai building efforts start at definitions:  Ai that can<br />specify goals and weight them , acquire combine breakdown and refine strategy.<br /><br />A strategy specifies goals, their desirability and at what likelihoods to take what actions on what (set of) conditions. <br /><br />Devising strategies can be broken down into:<br />creating and assessing conditions for actions,<br />weight of goals, estimates of cost for actions,<br />estimates of effectiveness of actions, finding related strategies,<br />taking strategies apart,<br />combining strategies,<br />covering contingencies,<br />evaluating strategies</div>
</div>
<div class='comment'>
<div class='author'>Jiri Jelinek</div>
<div class='content'>
I would be interested to see the input from which this AI (when implemented) would be able to learn how to play the 5-in-a-row game.</div>
</div>
<div class='comment'>
<div class='author'>Jiri Jelinek</div>
<div class='content'>
This comment has been removed by the author.</div>
</div>
<div class='comment'>
<div class='author'>Daivd</div>
<div class='content'>
@acetoline No, the project is not abandoned, but thanks for asking :). I tend to post infrequent, overambitiously long posts, so a few weeks silence is normal.<br /><br />The reason the github activity is low is more silly. I am currently in something between the design and implementation stage, writing Python code with a few pseudocode elements and a lot of prose. For some reason, I have not considered this semi-code &quot;commit-worthy&quot;.<br /><br />I promise a github update this week.</div>
</div>
<div class='comment'>
<div class='author'>acetoline</div>
<div class='content'>
Hi, I noticed there hasn&#39;t been any activity on your blog or github lately. I hope you haven&#39;t abandoned the project.</div>
</div>
<div class='comment'>
<div class='author'>Jiri Jelinek</div>
<div class='content'>
Doesn&#39;t sound like a well scalable solution. Don&#39;t get overexcited/misled after some early luck in well defined toy worlds. With teaching by manual algorithm entry by techies, you aren&#39;t gonna get very far.</div>
</div>
<div class='comment'>
<div class='author'>Daivd</div>
<div class='content'>
Hopefully, yes, it should be able to solve general problems using more specialized algorithms working together. It will not, however, take a set of specialized algorithms (let&#39;s say playing chess, checkers, poker and backgammon) and produce a general game playing algorithm. That is not how it achieves generality.<br /><br />It is geared towards very technical users. It takes input tasks as snippets of code and gives a set of inputs that makes the function output true. This is called function inversion and is a fairly simple way of describing puzzles and technical problems.<br /><br />If it turns out to be a useful system for solving these types of tasks (a big IF - no one has really been able to achieve that), it would be a very good base on which to build something that can communicate with non-technical users and interact with our fuzzy world. That is not its primary purpose, though.</div>
</div>
<div class='comment'>
<div class='author'>Jiri Jelinek</div>
<div class='content'>
Can this &#39;AGI&#39; generate general algorithms from a set of relevant non-general algorithms? Will non-technical users be able to teach this AI by describing specific (/non-general) scenarios?</div>
</div>
<div class='comment'>
<div class='author'>Daivd</div>
<div class='content'>
@Jiri Swedes, Norwegians, Danes and many Finns can read Swedish. That makes up a good 0.3% of the earth population :).<br /><br />Actually, I will remove that. That source code is not for human consumption yet. It is just test cases for analyzing source code, written in an odd Lisp dialect. No actual code relating to implementing either any of the algorithms I write about or the Scheduler.</div>
</div>
<div class='comment'>
<div class='author'>Mentifex</div>
<div class='content'>
One way to build a strong AI is outlined in the <a href="http://mind.sourceforge.net/aisteps.html" rel="nofollow">http://mind.sourceforge.net/aisteps.html</a> and develops into a simple but gradually <a href="http://www.scn.org/~mentifex/AiMind.html" rel="nofollow">expandable AI Mind</a>.</div>
</div>
<div class='comment'>
<div class='author'>Jiri Jelinek</div>
<div class='content'>
Don&#39;t use Swedish in the source, man! &#39;Nobody&#39; can read that ;-)</div>
</div>
</div>
]]></content>
  </entry>
  
  <entry>
    <title type="html"><![CDATA[Compression, prediction and artificial intelligence]]></title>
    <link href="http://gurgeh.github.com/blog/2010/12/28/compression-prediction-and-artificial/"/>
    <updated>2010-12-28T00:00:00+01:00</updated>
    <id>http://gurgeh.github.com/blog/2010/12/28/compression-prediction-and-artificial</id>
    <content type="html"><![CDATA[<div class='post'>
Compression is one of the most powerful concepts in computing. For the normal computer user, compression is associated with making files smaller, so you can store more of them or send them faster over the Internet, but there is much more to it.<br /><br />An optimal compressor can (or could, if it existed) be used for prediction of the future of a sequence of events (weather, sports, stocks, political events, etc), for example by trying all possible continuations and examining how well each one compresses given the history. Conversely, an optimal predictor that gives the correct probability of each possible next symbol can be used for optimal compression by using <a href="http://en.wikipedia.org/wiki/arithmetic_coding">arithmetic coding</a>.<br /><span class="Apple-style-span" style="font-size: large;"><br /></span><br /><span class="Apple-style-span" style="font-size: large;">Compression and prediction</span><br /><br /><i>This section on background theory contains possibly scary math and dense prose, but should be understandable for most programmers. Maybe re-read the sentences a couple of times.</i><br /><br />Ray Solomonoff has shown&nbsp;<a href="http://www.theworld.com/~rjs/chris1.pdf">[PDF]</a>&nbsp;that if we let Sk be the infinite set of all programs for a machine M, such that M(Sk) gives an output with X as prefix (i.e. the first bits of the output are X), then the probability of X becomes the sum of the probabilities of all of its programs, where the probability of a program is&nbsp;2 ** (-|Sk|), |Sk| is the length of the program in bits and &#8220;**&#8221; means &#8220;to the power of&#8221;. 
As X gets longer, the error of the predictions approaches zero, if the error is calculated as the total squared probability difference.<br /><br />A technicality is that only those programs count that no longer produce X when the last bit of the program is removed.<br /><br />To give a slightly more concrete example, say that you have a sequence of events - a history - and encode those as a sequence of symbols, X. Let us further say that you have a machine, M, that can read a program S and output a sequence of symbols. If you have no further information on your sequence of events, then the best estimate for the probability of a symbol Z to occur next (i.e. the best prediction) is given by the set of all programs that output your history X followed by Z. Programs which output X+Z and are short are weighted higher (the 2 ** (-|Sk|) part).<br /><br /><br /><div style="margin-bottom: 0px; margin-left: 0px; margin-right: 0px; margin-top: 0px;">Even more concretely, given the binary sequence 101010101, you wonder what the probability is that the next bit will be 0 given that you know nothing else of this sequence. Sum 2 ** (-program length) for all programs that output 1010101010 vs those that output 1010101011 as their first bits (they are allowed to continue outputting stuff). 
If we call these sums sum0 and sum1 respectively, then the probability of 0 coming next is sum0 / (sum0 + sum1) and the probability of 1 coming next is sum1 / (sum0 + sum1).</div><br /><br />Obviously you cannot find all these programs by just trying every possible program, because 1) there are infinitely many of them and 2) given <a href="http://en.wikipedia.org/wiki/Halting_problem">the halting problem</a> you cannot in general know if a running program is in an infinite loop or if it will eventually output X.<br /><br />There is an area of probability theory called <a href="http://en.wikipedia.org/wiki/Minimum_description_length">Minimum description length</a>, where the language is chosen to be so simple (not Turing complete) that you can actually find the shortest program or &#8220;description&#8221;. Calculating probabilities this way is very similar to Bayesian probability, but more general.<br /><br /><span class="Apple-style-span" style="font-size: large;">Solomonoff in (almost) practice</span><br /><br />Although the point of the theorem is not to apply it directly in practice, for short sequences X we can actually try. We can avoid problem 1 above (there are infinitely many programs) by generating random programs and seeing whether they produce X. If they do, we count them. This way we can produce an approximation of what the true sum0 and sum1 are. If we set up our random generation such that shorter programs are more likely, then we don&#8217;t have to bother with the 2 ** (-program length) part and may just keep a running count for each sum. If the sequence is too long, this method will be impractical since almost no randomly generated programs will actually output X.<br /><br />Problem 2 above (the programs we test may not halt) is harder, but Levin has proposed a way around it. 
If we in addition to program length use running time (number of instructions executed) as a measure of the probability of our program, we can start by generating all the programs that we intend to test and then run them all in parallel. As our execution moves forward, we will get an increasingly accurate approximation of the true sum0 and sum1, without getting stuck on infinite loops.<br /><br />If we want to get even more practical, it can be shown that the shortest program that produces X will generally dominate the others and thus it will predict the most likely next symbol. That way you can just search for programs that output X and the currently shortest program will be your best guess. Since we no longer care about the relative probability of the next symbol, but only which is most likely, the search does not have to be random. Thus we can use any method we like for finding a short program. If you search for programs that produce X and find one that almost does, you can construct a &#8220;true&#8221; solution from that one, by constructing a prefix part that hard codes the places where your original program is wrong. This will produce a longer program, where the length, and thus the &#8220;score&#8221;, is the length of the prefix + the length of your faulty solution. The size of the shortest program that outputs X is called the <a href="http://en.wikipedia.org/wiki/Kolmogorov_complexity">Kolmogorov complexity</a> of X. 
The size of the shortest program that outputs X, measured in program size + log(running time), is called the <a href="http://www.scholarpedia.org/article/Algorithmic_complexity">Levin complexity</a> of X.<br /><br />One way to find these programs is to use <a href="http://en.wikipedia.org/wiki/Genetic_programming">genetic programming</a>, just take care that you don&#8217;t think that you can count the number of programs that produce X and get relative probabilities, because your search will now be skewed towards the solution (and thus its prediction) that you find first.<br /><br />A small problem is that depending on what machine you choose, i.e which instructions your programs can use and the length of these instructions, you will get different results. The method has a built in bias, since there is no one correct Turing complete language. This difference will however be smaller as X gets longer. One way to understand that is to note that any Turing complete language can emulate any other Turing complete language and that the size of such an emulator is finite. This is called the compiler theorem.<br /><br /><span class="Apple-style-span" style="font-size: large;">Compression is understanding and the Hutter Prize</span><br /><br />When we understand something, we can describe it succinctly. If I have an image of a red perfect circle, the size will be much larger if I describe the individual pixels rather than just say &#8220;a red circle of diameter d and thickness t&#8221;. When I understand what the image is depicting, I can describe it shorter. Sometimes a lossy compression of observed data will actually express the truth better than the exact data. 
If I take a photo of a red circle, the photo will probably not be perfect, but if I notice what the photo is showing, I can compress it as &#8220;a red circle&#8221; and some noise which I throw away, and suddenly my lossy compression is a better depiction of the truth.<br /><br />This equivalence between compression and general intelligence led Marcus Hutter to announce the <a href="http://prize.hutter1.net/">Hutter Prize</a>, where money is awarded for the best compression of 100 megabytes of English Wikipedia articles. So far the compression algorithms have been impressive (compressing the text to about 15% of its original size), but have not shown much intelligence or understanding of the articles. When they do start to exhibit some understanding, I think that if they are allowed to compress the data in a slightly lossy way, the first thing that will happen is that some spelling and layout mistakes will be corrected, because these will be surprising to the compressor and thus demand an unusually long representation.<br /><br />Matt Mahoney has written a good rationale on the Hutter Prize <a href="http://cs.fit.edu/~mmahoney/compression/rationale.html">here</a>.<br /><br /><span class="Apple-style-span" style="font-size: large;">Compression in practice - the juicy stuff</span><br /><br />Compression is a powerful tool to measure success and avoid overfitting in a variety of common AI problems. The methods I laid out here are interesting mostly from a theoretical perspective, because of their prohibitively long running times.&nbsp;In my next post, I will expand on my thoughts on how you can use these results to get actual, practical algorithms for common AI problems.<br /><span class="Apple-style-span" style="font-size: large;"><br /></span></div>
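<div>The sum0 / sum1 calculation from the 101010101 example can be tried out on a toy machine. The following is only a minimal sketch under loud assumptions: the "machine" here simply repeats the program's bits forever (so it is not Turing complete and always halts), all programs up to a small length are enumerated instead of sampled at random, and the technicality about removing the last bit is ignored. The names run and predict_next_bit are made up for this illustration.</div>

```python
from fractions import Fraction
from itertools import product


def run(program: str, n_bits: int) -> str:
    """Toy machine (an assumption): output the program's bits, repeated forever."""
    return "".join(program[i % len(program)] for i in range(n_bits))


def predict_next_bit(history: str, max_len: int = 12) -> Fraction:
    """Estimate the probability that the bit following `history` is 0.

    Assumes max_len >= len(history), so that at least one program
    (the history itself) reproduces the history.
    """
    sums = {"0": Fraction(0), "1": Fraction(0)}
    for length in range(1, max_len + 1):
        weight = Fraction(1, 2 ** length)  # the 2 ** (-program length) part
        for bits in product("01", repeat=length):
            output = run("".join(bits), len(history) + 1)
            if output.startswith(history):
                # This program reproduces the history; its next bit gets a vote.
                next_bit = output[-1]
                sums[next_bit] += weight
    return sums["0"] / (sums["0"] + sums["1"])


if __name__ == "__main__":
    p0 = predict_next_bit("101010101")
    print(f"P(next bit is 0) = {float(p0):.3f}")
```

<div>On this toy machine the short program "10" carries most of the weight, so 0 comes out as by far the most likely next bit. With a real, Turing complete machine the enumeration would run into exactly the two problems described above.</div>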
]]></content>
  </entry>
  
  <entry>
    <title type="html"><![CDATA[Rescuing a hosed system using only Bash]]></title>
    <link href="http://gurgeh.github.com/blog/2010/08/27/rescuing-hosed-system-using-only-bash/"/>
    <updated>2010-08-27T00:00:00+02:00</updated>
    <id>http://gurgeh.github.com/blog/2010/08/27/rescuing-hosed-system-using-only-bash</id>
    <content type="html"><![CDATA[<div class='post'>
<span class="Apple-style-span" style="font-size: x-large;">Prelude</span><br />Yesterday I was in a productive mood. &#8220;Let&#8217;s upgrade the ancient Gentoo Linux install on that server that nobody dares to use because the OS is too shoddy&#8221;, I thought. Since the Gentoo image was from 2005 and had never been updated, it seemed impossible to upgrade it using normal methods. There were dependencies blocking each other and just an all-around awful mess.&nbsp;I downloaded the latest install tarball and decided to just extract it right over the old install. &#8220;What is the worst thing that can happen&#8221;, right?<br /><br />As it turned out, nothing special happened. It all worked smoothly. Until I ran &#8220;emerge&#8221; - the package manager for Gentoo. It decided that all those installed packages were quite unnecessary and proceeded to uninstall everything. <b>Everything. </b>Until it could uninstall no more, because it had broken itself.<br /><br /><span class="Apple-style-span" style="font-size: x-large;">Challenge</span><br />Now I had a system without anything in /bin, /sbin, /usr/bin, etc. Everything was gone. All that I had left were two remote ssh connections from my desktop which, quite heroically, stayed up despite the best efforts of emerge. I could not open any new connections. The server itself is located on a magical island, far, far away, called Hisingen. I had no intention of making a trip there. Yet.<br /><br />OK, what can we do with no binaries?<br />This is pretty much it:<br /><a href="http://www.gnu.org/software/bash/manual/html_node/Bash-Builtins.html">http://www.gnu.org/software/bash/manual/html_node/Bash-Builtins.html</a><br /><br />Notice that such practical commands as &#8220;ls&#8221;, &#8220;mv&#8221;, &#8220;ed&#8221; and &#8220;cp&#8221; are not built in. This means that we cannot list or copy files. Or rename them. Or move them. Or edit them. &#8220;echo&#8221; and &#8220;cd&#8221; are OK, though. 
Also we can create new files with echo &#8220;blabla&#8221; &gt; theFile.<br /><br />&#8220;Bwaha! All I have to do is use tab completion to see what files are in a directory&#8221;. I chuckled triumphantly to myself, my seductive beard dancing in the wind. Luckily tab completion reported that my /bin/ was full of executables. Unluckily /bin/ was not actually full of executables when I tried to run them. It seems that Bash or Linux or someone had cached the tab completion results.<br /><br />Since I had the Gentoo tbz-image still in the root directory, all I needed was a way to extract that and I would have all my precious programs back.<br /><br /><span class="Apple-style-span" style="font-size: x-large;">Remote file copy</span><br /><br />OK.. how do I get bzip2 and tar to the server? Well, using echo &#8220;&#8230;.&#8221; &gt; file, it is possible to create new files. But how would you write binary data using echo? It turns out that one can write any byte using \x-hexadecimal escape codes. Unfortunately if you write the zero-byte, \x00, echo terminates. Executables are full of zero-bytes, so we need a way to write them too. 
Well, it turns out that echo can write zero-bytes without terminating using octal escape codes - \0000 will do the trick.<br /><br />I created a Python program for taking a binary file and convert it to several lines of the type:<br /><br /><blockquote>&gt; echo -en $&#8217;\x6e&#92;0000\x5f\x69\x6e\x69\x74&#92;0000\x6c\x69\x62\x63\x2e\x73\x6f\x2e\x36&#92;0000\x66\x66\x6c\x75\x73\x68&#92;0000\x73\x74\x72\x63\x70\x79&#92;0000\x66\x63\x68\x6d\x6f\x64&#92;0000\x65\x78\x69\x74&#92;0000\x73\x74\x72\x6e\x63\x6d\x70&#92;0000\x70\x65\x72\x72\x6f\x72&#92;0000\x73\x74\x72\x6e\x63\x70\x79&#92;0000\x73\x69\x67\x6e\x61\x6c&#92;0000\x5f\x5f\x73\x74\x61\x63\x6b\x5f\x63\x68\x6b\x5f\x66\x61\x69\x6c&#92;0000\x73\x74\x64\x69\x6e&#92;0000\x72\x65\x77&#8217; &gt; myfile</blockquote><blockquote>&gt; echo -en $&#8217;\x69\x6e\x64&#92;0000\x69\x73\x61\x74\x74\x79&#92;0000\x66\x67\x65\x74\x63&#92;0000\x73\x74\x72\x6c\x65\x6e&#92;0000\x75\x6e\x67\x65\x74\x63&#92;0000\x73\x74\x72\x73\x74\x72&#92;0000\x5f\x5f\x65\x72\x72\x6e\x6f\x5f\x6c\x6f\x63\x61\x74\x69\x6f\x6e&#92;0000\x5f\x5f\x66\x70\x72\x69\x6e\x74\x66\x5f\x63\x68\x6b&#92;0000\x66\x63\x68\x6f\x77\x6e&#92;0000\x73\x74\x64\x6f\x75\x74&#92;0000\x66\x63\x6c\x6f\x73\x65&#92;0000\x6d\x61\x6c\x6c\x6f\x63&#92;0000\x72\x65\x6d&#8217; &gt;&gt; myfile</blockquote><blockquote>&gt; echo -en $&#8217;\x6f\x76\x65&#92;0000\x5f\x5f\x6c\x78\x73\x74\x61\x74\x36\x34&#92;0000\x5f\x5f\x78\x73\x74\x61\x74\x36\x34&#92;0000\x67\x65\x74\x65\x6e\x76&#92;0000\x5f\x5f\x63\x74\x79\x70\x65\x5f\x62\x5f\x6c\x6f\x63&#92;0000\x73\x74\x64\x65\x72\x72&#92;0000\x66\x69\x6c\x65\x6e\x6f&#92;0000\x66\x77\x72\x69\x74\x65&#92;0000\x66\x72\x65\x61\x64&#92;0000\x75\x74\x69\x6d\x65&#92;0000\x66\x64\x6f\x70\x65\x6e&#92;0000\x66\x6f\x70\x65\x6e\x36\x34&#92;0000\x5f\x5f\x73\x74\x72\x63&#8217; &gt;&gt; myfile</blockquote><blockquote>&gt; echo -en 
$&#8217;\x61\x74\x5f\x63\x68\x6b&#92;0000\x73\x74\x72\x63\x6d\x70&#92;0000\x73\x74\x72\x65\x72\x72\x6f\x72&#92;0000\x5f\x5f\x6c\x69\x62\x63\x5f\x73\x74\x61\x72\x74\x5f\x6d\x61\x69\x6e&#92;0000\x66\x65\x72\x72\x6f\x72&#92;0000\x66\x72\x65\x65&#92;0000\x5f\x65\x64\x61\x74\x61&#92;0000\x5f\x5f\x62\x73\x73\x5f\x73\x74\x61\x72\x74&#92;0000\x5f\x65\x6e\x64&#92;0000\x47\x4c\x49\x42\x43\x5f\x32\x2e\x34&#92;0000\x47\x4c\x49\x42\x43\x5f\x32\x2e\x33&#92;0000\x47\x4c\x49&#8217; &gt;&gt; myfile</blockquote><div>Taking care to escape all the backslashes properly turned out to be a bit of a challenge.&nbsp;<i>Fun fact</i>: if you write the hex code for backslash twice, \x5c\x5c, Bash will first convert them to backslashes and then echo will interpret them as a new escape code and convert them to one backslash.</div><div><br /></div><div>Now I could cut and paste (very) small binaries, but I needed to paste a few megabytes. &#8220;Why a few megabytes?&#8221; you wonder. Well, since emerge removed all libraries as well, I had to compile the executables with all libraries linked statically. As it turns out, this makes a small utility much larger.</div><div><br /></div><div><span class="Apple-style-span" style="font-size: x-large;">Enter Konsole and DBUS</span></div><div>Konsole is a wonderful terminal program. Not only can I write stuff in it and make the text green on black and pretend I am Neo from &#8220;the Matrix&#8221;, I can also control it programmatically via DBUS. This means that I could write a Python program that sends characters to one of my sessions. I had to divide the file up into several messages of the form I showed above, and then send them. If I sent the messages too quickly, they got garbled and everything became a mess, so after each message I had to sleep for a short time.</div><div><br /></div><div>Using this method, I reached the staggering speed of 1K (yes, a thousand bytes) per second. 
Not quite as snappy as my more than fifteen-year-old 14.4K modem, which could in theory reach 14400 bits per second.</div><div><br /></div><div>I think that the final program turned out to be quite useful. Using it, I can send a file from one terminal to another.</div><div><br /></div><div><span class="Apple-style-span" style="font-size: x-large;">Run, Forrest, run!</span></div><div>A small problem turned up. How do I execute my executables? Chmod is not accessible and umask, which is a Bash builtin, just sets the maximum allowed privileges, rather than actually deciding how new files are created. As far as I know, this problem would be unsolvable if not for a tiny cheat.</div><div><br /></div><div>If you redirect output into a file that is already executable, the resulting file will be executable, even if you overwrite the old file with &#8220;&gt;&#8221;. Since we had a few executable script files lying around in /home, which emerge could not uninstall, it was a simple matter of finding an executable script file and overwriting it.</div><div><br /></div><div>If I had not had any executables, I still hoped that /proc would contain executable links to the still running programs, and that I somehow could pick an unimportant one (without knowing which is which, since I still cannot execute ls or cat or anything like that, remember?) and overwrite it. If Linux would let me.</div><div><br /></div><div><div>Using my trans-terminal copier, I managed to get the 800K&nbsp;<a href="http://www.busybox.net/">busybox</a>&nbsp;(a wonderful tool, which emulates all the standard Linux commands and then some)&nbsp;to my broken server, under the guise &#8220;feedback.py&#8221;. This turned out to pose a new problem, since busybox refuses to run under any other name than busybox or one of its commands. 
This is because busybox will check under what name it was called and emulate that command.&nbsp;Feedback.py was not one of the builtins, apparently.&nbsp;Now I needed a way to rename the file to busybox again, so I had to statically compile GNU coreutils (./configure LDFLAGS=&quot;-static&quot; is your friend) and transfer &#8220;cp&#8221;. All 700K of it.</div></div><div><br /></div><div><span class="Apple-style-span" style="font-size: x-large;">Summing up</span></div><div><br /></div><div>Even if I had not had a Gentoo install image lying around, it would not have posed a problem by now, since busybox includes both wget and ftp.</div><div><br /></div><div>I extracted my install image and without doing anything further, I could suddenly make new ssh connections again! Feeling quite heroic, I decided to blog about it, since someone else (or I in the future, God forbid) might find it useful. And here we are.</div><div><br /></div><div>Since the terminal-copy program could also conceivably be useful for someone else, I will post it somewhere public.</div><div><br /></div></div>
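<div>The binary-to-echo conversion from the &#8220;Remote file copy&#8221; section can be sketched roughly as below. This is not the original program, just a minimal illustration with a deliberately simpler encoding: every byte is written as a three-digit octal escape (\0NNN) inside plain single quotes, which the Bash builtin echo decodes when given -e. That sidesteps both the zero-byte and the backslash-escaping trouble. The function name to_echo_commands, the chunk size and the assumption that the target file name needs no quoting are all made up for this example.</div>

```python
def to_echo_commands(data: bytes, target: str, chunk_size: int = 64) -> list[str]:
    """Turn raw bytes into Bash `echo -ne` lines that recreate them in `target`.

    Assumes `target` is a simple file name that needs no shell quoting.
    """
    commands = []
    for offset in range(0, len(data), chunk_size):
        chunk = data[offset:offset + chunk_size]
        # Every byte becomes \0NNN (three octal digits), which `echo -e` decodes.
        payload = "".join(f"\\0{byte:03o}" for byte in chunk)
        # The first chunk truncates the target file, later chunks append to it.
        redirect = ">" if offset == 0 else ">>"
        commands.append(f"echo -ne '{payload}' {redirect} {target}")
    return commands


if __name__ == "__main__":
    # A few awkward bytes: a zero-byte, a backslash, a single quote, a newline.
    sample = b"\x7fELF\x00\x01\\'\n"
    for line in to_echo_commands(sample, "myfile", chunk_size=4):
        print(line)
```

<div>Each printed line can be pasted into the surviving shell. The price of the octal-only encoding is size: five characters per byte, compared to four with \x escapes, which matters at a thousand bytes per second.</div>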
<h2>Comments</h2>
<div class='comments'>
<div class='comment'>
<div class='author'>Noel D</div>
<div class='content'>
Thank you so much. I renamed a system file on AIX libcrypt.a as root and then none of the commands worked.</div>
</div>
<div class='comment'>
<div class='author'>hsteoh</div>
<div class='content'>
Thanks SO MUCH for this page!!! The past weekend, I accidentally hosed the dynamic linker on my remote server, and nothing could run (ssh can&#39;t even start a new session). The only thing left was a root bash shell over the last ssh connection I have to the box. Using your echo trick, I was able to copy a statically-linked busybox over, and recover the system.<br /><br />Just one note, though: you don&#39;t really need konsole or a python script to spool the lines over to the shell; the reason you can&#39;t just copy-n-paste 2000+ echo commands is because bash fiddles with terminal settings after every command, causing the input buffer to lose some characters after every 10 commands or so. The solution is to use downdiagonal&#39;s cat trick to copy the echo commands into a *script* on the target machine, then use (source script_filename) to run it to recreate the binary file. While bash is in the read line loop, it doesn&#39;t fiddle with the terminal, so it will actually be able to read 2000+ lines of echo commands without any mangling. (You can use a utility like xclip to copy an entire file into X11&#39;s clipboard, then transmit the whole thing with a single paste operation.)<br /><br />Of course, the cat trick adds another layer of interpolation, so you&#39;ll need to double-escape all your backslashes, etc..<br /><br />Using this method, I was able to copy about 2705 echo commands using a single copy-n-paste operation to recreate busybox.<br /><br />Thanks again, you saved my life!!</div>
</div>
<div class='comment'>
<div class='author'>previouslysilent</div>
<div class='content'>
I&#39;ve never had to recover from such a badly broken system as this, but have had to cope with a system when /lib and /usr/lib have become corrupted so that the dynamically linked /bin and /usr/bin exes have become unusable, so this was an interesting article, thanks for blogging.<br /><br />As ever, the first rule of hosing a system is DONT PANIC. It&#39;s very easy to charge in and make things worse, when some careful preservation of what still works can allow you to recover from what might seem a hopeless situation!</div>
</div>
<div class='comment'>
<div class='author'>downdiagonal</div>
<div class='content'>
I don&#39;t think there&#39;s any way to emulate chmod without some herculean effort, but I would love to see it if someone knows a way.<br /><br />Next time, instead of trying to get a copy of busybox to the machine, it would probably be better to use a copy of sash (stand-alone shell). It has much of the functionality of coreutils as shell builtins instead of using symbolic links and changing its behavior based on argv[0] as busybox does.<br /><br />Another useful trick to get files on to the machine is to use bash&#39;s net redirections to grab files from a working machine. You could pretty easily hack together a rudimentary replacement for wget by reading from and writing to /dev/tcp.<br /><br />As far as overwriting links in /proc goes, I was under the impression that /proc only contains symbolic links to executables. For example:<br /><br />$ file /proc/27252/exe <br />/proc/27252/exe: symbolic link to `/bin/bash&#39;<br /><br />When you try to overwrite a broken symbolic link it creates a new file where the link is pointing with default permissions, so I don&#39;t think that that would help.</div>
</div>
<div class='comment'>
<div class='author'>Daivd</div>
<div class='content'>
@downdiagonal and @h-i-r.net<br />Interesting.. My shell scripting knowledge is clearly lacking, but I think (hope..) I could have written that cat if I really needed ;).<br /><br />The problem is that even with the bash-cat, I would not have been able to make a good cp, because it would not be copying the execute permissions.<br /><br />If anybody has any suggestions on how to construct a chmod or execute a file without execute permissions, that would be very interesting and quite impressive.<br /><br />Maybe the trick I suggested with overwriting executable hard links in /proc would work?</div>
</div>
<div class='comment'>
<div class='author'>downdiagonal</div>
<div class='content'>
Emulate cat with just bash:<br /><br />(IFS=$&#39;\n&#39;;while read line;do echo &quot;$line&quot;;done) &lt; file.ext</div>
</div>
<div class='comment'>
<div class='author'>Ax0n</div>
<div class='content'>
Err&#8230; I meant echo, not cat.</div>
</div>
<div class='comment'>
<div class='author'>h-i-r.net</div>
<div class='content'>
That&#39;s some useful stuff I hadn&#39;t thought of before. I forgot cat would decode hex. Here&#39;s some stuff I whipped up that might have helped you as far as building out a few things for ls, ps and the like: http://www.h-i-r.net/2009/08/cratered-your-linux-box-here-are-some.html</div>
</div>
<div class='comment'>
<div class='author'>Daivd</div>
<div class='content'>
@Jeff Goldschrafe<br />&quot;cat&quot; is a program. All my programs were gone. I only had access to what Bash has built in.</div>
</div>
<div class='comment'>
<div class='author'>Daivd</div>
<div class='content'>
@Bjartr<br />Neat trick! I did not think of that. Before I did anything else, I found someone who had implemented ls in 1017 bytes using assembler and pasted the binary semi-manually:<br />http://www.muppetlabs.com/~breadbox/software/tiny/<br /><br />It is actually quite a capable version of ls. Later it let me see the size of the files I had transferred.</div>
</div>
<div class='comment'>
<div class='author'>Jeff Goldschrafe</div>
<div class='content'>
Couldn&#39;t you have emulated cp with &quot;cat somefile &gt; someotherfile&quot; or some such?</div>
</div>
<div class='comment'>
<div class='author'>Bjartr</div>
<div class='content'>
Instead of tab completion to find what&#39;s in a directory you can use echo *</div>
</div>
</div>
]]></content>
  </entry>
  
  <entry>
    <title type="html"><![CDATA[Milliblogging - An essay in 7 tweets]]></title>
    <link href="http://gurgeh.github.com/blog/2010/08/19/milliblogging-essae-in-7-tweets/"/>
    <updated>2010-08-19T00:00:00+02:00</updated>
    <id>http://gurgeh.github.com/blog/2010/08/19/milliblogging-essae-in-7-tweets</id>
    <content type="html"><![CDATA[<div class='post'>
Twitter became popular because of its 140 character restriction, originally designed to fit an SMS.<br /><br />This restriction promises the follower that everything will be easily digested, but perhaps more important is the benefit to the writer.<br /><br />Often someone starts a blog, but then quits when every post becomes too long. Too ambitious. Twitter frees the writer of such pressure.<br /><br />Pressure to be concise is great, but I propose that the 140 char limit is too limiting to express anything interesting. Tweets are dumb.<br /><br />Instead of microblogging, I propose #milliblogging. You must constrict yourself to (maybe?) 1000 characters - the length of 7 tweets.<br /><br />Would you join such a site? Would you rather it be an extended Twitter-client or a new community?<br /><br />#Milliblogging - for people with something to say.<br /><br />(This post was also tweeted by me on <a href="http://www.twitter.com/fnedrik">http://www.twitter.com/fnedrik</a> )</div>
<h2>Comments</h2>
<div class='comments'>
<div class='comment'>
<div class='author'>Daivd</div>
<div class='content'>
I started out with the target of 5 tweets / 700 characters, but I found out that to express the idea itself, I needed 7 tweets.</div>
</div>
<div class='comment'>
<div class='author'>RBerenguel</div>
<div class='content'>
Interesting idea. I don&#39;t know. Usually my posts are way long&#8230; Maybe I can try to write my next post as 7 tweets to try.<br /><br />Ruben</div>
</div>
</div>
]]></content>
  </entry>
  
  <entry>
    <title type="html"><![CDATA[Digital immortality, true AI and the destruction of mankind]]></title>
    <link href="http://gurgeh.github.com/blog/2010/06/16/digital-immortality-true-ai-and/"/>
    <updated>2010-06-16T00:00:00+02:00</updated>
    <id>http://gurgeh.github.com/blog/2010/06/16/digital-immortality-true-ai-and</id>
    <content type="html"><![CDATA[<div class='post'>
Around the world a few people and companies are working towards the goal of <a href="http://en.wikipedia.org/wiki/Artificial_general_intelligence">strong AI</a>&nbsp;or artificial general intelligence, which is more or less the same thing. Too few, if you ask me.<br /><br />Today I was reminded of a strategy towards AGI that I have sometimes dreamed of. It is not the strategy I believe most in, but I think it is interesting nonetheless.<br /><br />What if you (assuming you are a competent programmer), from now on, tried to automate as many of your computer tasks as possible? Instead of doing something, try making the computer do it. Even if it takes you ten or a hundred times as long, your time investment will hopefully pay dividends in the future. Starting out, many things will be out of reach, but you will slowly build a knowledge base and an algorithm base that can mimic your preferences and skills. This will enable you to take on harder tasks and so on. You are building a digital assistant from the ground up.<br /><br />Maybe this undertaking is too ambitious for one person, especially if they actually wanted to get something done besides building a digital assistant. In that case, my proposal stands, but instead use a small team (Google and Microsoft, I know you have a few talented guys to spare for a grand project) that tries to automate the computer tasks of one guy.<br /><br />Take email, for example. Propose automatic actions on incoming mail, including replies, forwards and adding stuff to the calendar. Initially, very few emails will be understandable, but gradually I expect the algorithm to get better at parsing language and to get a better model of the user and the world. Perhaps one part of the knowledge base is building a <a href="http://en.wikipedia.org/wiki/Bayesian_network">Bayesian network</a>&nbsp;that models the user&#8217;s preferences. 
The important thing is: solve the emails one at a time, using as general rules as you can get away with, but as specialized rules as you practically have to.<br /><br />Want to look something up on Wikipedia? Make travel plans online? Make a purchasing decision? Solve it in code, as general as you can. When the AI is further advanced, you start to write documents and code collaboratively, and so forth. One way of developing the AI is to let it observe your digital life and all your actions. Ultimately, what you end up with is a digital model of yourself, that gets more and more like the original. It answers mail, reads news and maybe comments on it in tweets and blogs. In effect, you achieve digital immortality.<br /><br />Obviously, the weakness here is that I have not proposed what algorithms should be used for this digital alter ego. I do, however, feel that the task of general AI will benefit from both clever general algorithms and clever specialized algorithms and specialized knowledge. An organic hodge-podge of hacks and patches, very much like the brain itself.<br /><br />When a mathematician approaches the problem of AGI, they want a clean solution. One algorithm to bind them all, like the <a href="http://www.idsia.ch/~juergen/goedelmachine.html">Gödel Machine</a>&nbsp;of&nbsp;<a href="http://www.idsia.ch/~juergen/">Jürgen Schmidhuber</a>&nbsp;and <a href="http://www.hutter1.net/">Marcus Hutter</a>. When engineers approach the same problem, they tend to engineer grand designs, like <a href="http://en.wikipedia.org/wiki/Ben_Goertzel">Ben Goertzel</a>&#8217;s <a href="http://novamente.net/">Novamente</a>&nbsp;and <a href="http://www.opencog.org/wiki/The_Open_Cognition_Project">OpenCog</a>. I can certainly see the charm of both approaches and I hope that they succeed, but maybe the most practical way forward is just to tackle one small real-world task at a time - the &#8220;guided hodge-podge&#8221; approach.<br /><br />About the destruction of mankind? 
No, I don&#8217;t think we will have any of that, but some smart people do. Like Michael Anissimov: <a href="http://www.acceleratingfuture.com/michael/blog/2007/04/why-is-ai-dangerous/">&#8220;Why is AI dangerous?&#8221;</a>. Still, a title is better when it involves destruction, don&#8217;t you think?</div>
<h2>Comments</h2>
<div class='comments'>
<div class='comment'>
<div class='author'>Michael Anissimov</div>
<div class='content'>
Why don&#39;t you think there&#39;s a risk from AI destroying us?  Just context-insensitive optimism, or what?  I suggest you check out Stephen Omohundro&#39;s paper, &quot;Basic AI Drives&quot;, and also Eliezer Yudkowsky&#39;s &quot;AI as a Positive and Negative Factor in Global Risk&quot;.</div>
</div>
<div class='comment'>
<div class='author'>Harley Mellifont</div>
<div class='content'>
The problem with most strong AI advocates like yourself is that you think that all it takes to be intelligent is to behave intelligently. This is false. In psychology, can you truly determine someone&#39;s intent from their behaviour alone? Take Stephen Hawking, for example: he can communicate only through technology. If he didn&#39;t have that technology, he would still be highly intelligent, but not according to you since he can&#39;t behave intelligently. There is far more to intelligence than behaviour: understanding is what needs to be modelled.<br /><br />I suggest you read the book On Intelligence by Jeff Hawkins.</div>
</div>
<div class='comment'>
<div class='author'>Guili</div>
<div class='content'>
&quot;solve the emails one at a time, using as general rules as you can get away with, but as specialized rules as you practically have to&quot;.<br />From what I learned doing my PhD in NLP, the more emails you &quot;solve&quot; (i.e. parse the syntax, disambiguate the 67 possible solutions, analyze the sense of each word in its context, combine these senses, disambiguate the 670 possible interpretations), the more your general rules/ specific rules ratio will decrease. <br />I can&#39;t prove this, but it&#39;s based on my observation of several rule-based systems in NLP.</div>
</div>
<div class='comment'>
<div class='author'>Ian Ozsvald</div>
<div class='content'>
I&#39;m a part of one of those companies that&#39;s working on an AGI - the project is RIA (http://www.qtara.com/) and the beta looks just like the website. It can help you write email, use Skype, research, read the news - all via a natural language spoken interface.<br /><br />An open source equivalent by the very smart John is: http://code.google.com/p/open-allure-ds/<br /><br />I&#39;m building up a set of like-minded folk for the A.I.Cookbook, mostly using Python to solve useful problems. Some of the active projects are documented here: http://blog.aicookbook.com/ with a budding discussion group: http://groups.google.com/group/aicookbook<br /><br />Cheers,<br />Ian.</div>
</div>
</div>
]]></content>
  </entry>
  
  <entry>
    <title type="html"><![CDATA[Solving Sudoku with genetic algorithms]]></title>
    <link href="http://gurgeh.github.com/blog/2010/05/05/solving-sudoku-with-genetic-algorithms/"/>
    <updated>2010-05-05T00:00:00+02:00</updated>
    <id>http://gurgeh.github.com/blog/2010/05/05/solving-sudoku-with-genetic-algorithms</id>
    <content type="html"><![CDATA[<div class='post'>
I recently wrote a small Python library for <a href="http://en.wikipedia.org/wiki/Genetic_algorithm">genetic algorithms</a> (GA), called <a href="http://code.google.com/p/optopus">optopus</a>. One thing I tried when I played around with it was to solve a <a href="http://en.wikipedia.org/wiki/Sudoku">Sudoku</a> puzzle. There are plenty of efficient ways to solve Sudoku, but with my shiny new hammer, all problems look like nails.<br /><br />Also, I remember that I once read something from someone who tried a GA C-library on Sudoku and concluded that it was not a suitable problem. If I could solve it with my slick library, that random person on the internet, whose web page I might never find again but who may exist as far as you know, would certainly be proven wrong. A worthy cause.<br /><br /><span class="Apple-style-span" style="font-size: x-large;">Genetic algorithms</span><br />A genetic algorithm is a general way to solve optimization problems. The basic algorithm is very simple:<br /><ol><li>Create a population (vector) of random solutions (represented in a problem specific way, but often a vector of floats or ints)</li><li>Pick a few solutions and sort them according to fitness</li><li>Replace the worst solution with a new solution, which is either&nbsp;a copy of the best solution, a mutation (perturbation) of the best&nbsp;solution, an entirely new randomized solution or a cross between the two&nbsp;best solutions. These are the most common evolutionary operators, but you could dream up others that use information from existing solutions to create new potentially good solutions.</li><li>Check if you have a new global best fitness, if so, store the solution.</li><li>If too many iterations go by without improvement, the entire population&nbsp;might be stuck in a local minimum (at the bottom of a local valley, with a possible chasm&nbsp;somewhere else, so to speak). 
If so, kill everyone and start over at 1.</li><li>Go to 2.</li></ol>Fitness is a measure of how good a solution is, lower meaning better. This measure is performed by a fitness function that you supply. Writing a fitness function is how you describe the problem to the GA. The magnitude of the fitness values returned does not matter (in sane implementations), only how they compare to each other.<br /><br />There are other, subtly different, ways to perform the evolutionary process. Some are good and some are popular but bad. The one described above is called tournament selection and it is one of the good ways. Much can be said about the intricacies of GA but it will have to be said somewhere else, lest I digress completely.<br /><br /><span class="Apple-style-span" style="font-size: x-large;">Optopus and Sudoku</span><br />Using optopus is easy:<br /><br /><pre class="brush: python">from optopus import ga, stdgenomes<br /><br />#Now we choose a representation. We know that the answer to the puzzle must be some permutation of the digits 1 to 9, each used nine times.<br /><br />init_vect = sum([range(1,10)] * 9, []) # A vector of 81 elements<br />genome = stdgenomes.PermutateGenome (init_vect)<br /><br />#I made a few functions to calculate how many conflicts a potential Sudoku solution has. I'll show them later, but for now let us just import the package. 
I also found a puzzle somewhere and put it in the PUZZLE constant.<br /><br />import sudoku<br />solver = ga.GA(sudoku.ga_sudoku(sudoku.PUZZLE), genome)<br /><br />#And now, when we have supplied the GA with a fitness function (ga_sudoku, which counts Sudoku conflicts) and a representation (genome), let us just let the solver do its magic.<br /><br />solver.evolve(target_fitness=0)<br /></pre><br />And in a few minutes (about 2.6 million iterations when I tried) the correct answer pops out!<br /><br />The nice thing about this method is that you do not have to know anything about how to solve a Sudoku puzzle or even think very hard at all. Note that I did not even bother to just let it search for the unknown values - it also has to find the digits that we already know (which should not be too hard with a decent fitness function, see below). The only bit of thinking we did was to understand that a Sudoku solution has to be a permutation of&nbsp;<i>[1, 2, 3, 4, 5, 6, 7, 8, 9, 1, 2, 3, 4, 5, 6, 7, 8, 9, 1, 2, 3, 4, 5, 6, 7, 8, 9, 1, 2, 3, 4, 5, 6, 7, 8, 9, 1, 2, 3, 4, 5, 6, 7, 8, 9, 1, 2, 3, 4, 5, 6, 7, 8, 9, 1, 2, 3, 4, 5, 6, 7, 8, 9, 1, 2, 3, 4, 5, 6, 7, 8, 9, 1, 2, 3, 4, 5, 6, 7, 8, 9]</i>, but this merely made the evolution part faster. If we wanted to make it faster still, we could make a genome type that lets us say that there are actually nine separate vectors that are each guaranteed to be a permutation of 1 to 9. 
We could have thought even less and represented the solution by 81 ints that are all in the range 1 to 9, by using another genome type:<br /><br /><span class="Apple-style-span" style="font-size: small;"><span class="Apple-style-span" style="font-family: 'Courier New', Courier, monospace;">&gt;&gt; genome = stdgenomes.EnumGenome(81, range(1,10))</span></span><br /><br />The range argument to EnumGenome does not have to be a vector of integers; it could be a vector of any objects, since they are never treated like numbers.<br /><br />In my experiment this took maybe 15 to 30 minutes to solve. For more difficult Sudoku puzzles, I would definitely go with the permutation genome, since using EnumGenome increases the search space to 9^81 or <i>196627050475552913618075908526912116283103450944214766927315415537966391196809</i> possible solutions.<br /><br />FYI, this is the puzzle in sudoku.PUZZLE:<br /><br /><span class="Apple-style-span" style="font-family: 'Courier New', Courier, monospace;"><b>&nbsp;&nbsp;4|8 &nbsp;| 17</b></span><br /><span class="Apple-style-span" style="font-family: 'Courier New', Courier, monospace;"><b>67 |9 &nbsp;| &nbsp;&nbsp;</b></span><br /><span class="Apple-style-span" style="font-family: 'Courier New', Courier, monospace;"><b>5 8| 3 | &nbsp;4</b></span><br /><span class="Apple-style-span" style="font-family: 'Courier New', Courier, monospace;"><b>-----------</b></span><br /><span class="Apple-style-span" style="font-family: 'Courier New', Courier, monospace;"><b>3 &nbsp;|74 |1 &nbsp;</b></span><br /><span class="Apple-style-span" style="font-family: 'Courier New', Courier, monospace;"><b>&nbsp;69| &nbsp; |78&nbsp;</b></span><br /><span class="Apple-style-span" style="font-family: 'Courier New', Courier, monospace;"><b>&nbsp;&nbsp;1| 69| &nbsp;5</b></span><br /><span class="Apple-style-span" style="font-family: 'Courier New', Courier, monospace;"><b>-----------</b></span><br /><span class="Apple-style-span" 
style="font-family: 'Courier New', Courier, monospace;"><b>1 &nbsp;| 8 |3 6</b></span><br /><span class="Apple-style-span" style="font-family: 'Courier New', Courier, monospace;"><b>&nbsp;&nbsp; | &nbsp;6| 91</b></span><br /><span class="Apple-style-span" style="font-family: 'Courier New', Courier, monospace;"><b>24 | &nbsp;1|5 &nbsp;</b></span><br /><span class="Apple-style-span" style="font-family: 'Courier New', Courier, monospace;"><br /></span><br /><span class="Apple-style-span" style="font-family: inherit;">I think a Sudoku puzzle that is harder for humans would not be that much harder for optopus to solve, but I have not tested it.</span><br /><span class="Apple-style-span" style="font-family: 'Courier New', Courier, monospace;"><br /></span><br /><span class="Apple-style-span" style="font-family: inherit;"><span class="Apple-style-span" style="font-size: x-large;">Sudoku fitness function</span></span><br /><span class="Apple-style-span" style="font-family: inherit;">OK, so that was a ridiculously easy way to solve a Sudoku puzzle, but I skipped one part that is crucial to all GA - describing the problem using a fitness function. I had to do the following:</span><br /><br /><pre class="brush: python">DIM = 9<br /><br />def one_box(solution, i):<br />    """Extract the 9 elements of a 3 x 3 box in a 9 x 9 Sudoku solution."""<br />    return solution[i:i+3] + solution[i+9:i+12] + solution[i+18:i+21]<br /><br />def boxes(solution):<br />    """Divide a flat vector into vectors with 9 elements, representing 3 x 3<br />    boxes in the corresponding 9 x 9 2D vector. 
These are the standard<br />    Sudoku boxes."""<br />    return [one_box(solution, i) for i in [0, 3, 6, 27, 30, 33, 54, 57, 60]]<br /><br />def splitup(solution):<br />    """Take a flat vector and make it 2D"""<br />    return [solution[i * DIM:(i + 1) * DIM] for i in xrange(DIM)]<br /><br />def consistent(solution):<br />    """Check how many different elements there are in each row.<br />    Ideally there should be DIM different elements, if there are no duplicates."""<br />    return sum(DIM - len(set(row)) for row in solution)<br /><br />def compare(xs1, xs2):<br />    """Compare two flat vectors and return how much they differ"""<br />    return sum(1 if x1 and x1 != x2 else 0 for x1, x2 in zip(xs1, xs2))<br /><br />def sudoku_fitness(flatsolution, puzzle, flatpuzzle=None):<br />    """Evaluate the fitness of flatsolution."""<br />    if not flatpuzzle:<br />        flatpuzzle = sum(puzzle, [])<br />    solution = splitup(flatsolution)<br />    fitness = consistent(solution) #check rows<br />    fitness += consistent(zip(*solution)) #check columns<br />    fitness += consistent(boxes(flatsolution)) #check boxes<br />    fitness += compare(flatpuzzle, flatsolution) * 10 #check that it matches the known digits<br />    return fitness<br /><br />def ga_sudoku(puzzle):<br />    """Return a fitness function wrapper that extracts the .genes attribute from<br />    an individual and sends it to sudoku_fitness."""<br />    flatpuzzle = sum(puzzle, []) #just a precalcing optimization<br />    def fit(guy):<br />        return sudoku_fitness(guy.genes, puzzle, flatpuzzle)<br />    return fit<br /></pre><br /><span class="Apple-style-span" style="font-family: 'Courier New', Courier, monospace;"><br /></span><br /><span class="Apple-style-span" style="font-family: inherit;">I know. This made the solution less clean. 
Still, I made it verbose for readability, so it is perhaps less code than it looks.</span><br /><span class="Apple-style-span" style="font-family: inherit;"><br /></span><br /><span class="Apple-style-span" style="font-family: inherit;">Take that, random internet guy!</span><br /><span class="Apple-style-span" style="font-family: inherit;"><br /></span></div>
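<div>
<br />For readers without optopus installed, the tournament loop from the numbered list under &#8220;Genetic algorithms&#8221; can be sketched in plain Python 3. This is only an illustration, not optopus&#8217;s actual code: the names <i>tournament_ga</i>, <i>misplaced</i>, <i>random_perm</i> and <i>swap_two</i> are made up for the example, mutation is the only evolutionary operator used, and the toy fitness simply counts how many elements of a permutation are out of place.<br /><br />
<pre class="brush: python">
import random


def tournament_ga(fitness, new_solution, mutate, pop_size=50, tournament=4,
                  iterations=20000, patience=5000, seed=0):
    """Minimise fitness with a tournament loop (lower fitness is better)."""
    rng = random.Random(seed)
    # Step 1: a population of random solutions.
    population = [new_solution(rng) for _ in range(pop_size)]
    best = min(population, key=fitness)
    stale = 0
    for _ in range(iterations):
        if fitness(best) == 0:
            break
        # Step 2: pick a few solutions and sort them, best first.
        picks = sorted(rng.sample(range(pop_size), tournament),
                       key=lambda i: fitness(population[i]))
        # Step 3: replace the worst pick with a mutation of the best pick.
        child = mutate(population[picks[0]], rng)
        population[picks[-1]] = child
        # Step 4: remember a new global best.
        if fitness(best) > fitness(child):
            best, stale = child, 0
        else:
            stale += 1
        # Step 5: restart from scratch if stuck for too long.
        if stale >= patience:
            population = [new_solution(rng) for _ in range(pop_size)]
            stale = 0
    return best


def misplaced(perm):
    """Toy fitness: number of positions i where perm[i] != i."""
    return sum(1 for i, x in enumerate(perm) if x != i)


def random_perm(rng):
    """A random permutation of 0..8."""
    perm = list(range(9))
    rng.shuffle(perm)
    return perm


def swap_two(perm, rng):
    """Mutation that swaps two positions, so the result stays a permutation."""
    i, j = rng.sample(range(len(perm)), 2)
    child = list(perm)
    child[i], child[j] = child[j], child[i]
    return child


best = tournament_ga(misplaced, random_perm, swap_two)
print(best, misplaced(best))
</pre>
<br />Swapping two positions keeps every candidate a permutation, which is the guarantee the post wants from a permutation genome; whether optopus mutates in the same way is an assumption.<br />
</div>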
<h2>Comments</h2>
<div class='comments'>
<div class='comment'>
<div class='author'>Daivd</div>
<div class='content'>
@blob Unfortunately I have no variable length genomes in optopus, but it is easy enough to create one yourself. Look at BaseGenome and perhaps take some inspiration from FloatGenome.</div>
</div>
<div class='comment'>
<div class='author'>blob</div>
<div class='content'>
Is there a built-in Genome that allows for a variable number of objects?<br /><br />Thanks! (:</div>
</div>
<div class='comment'>
<div class='author'>Stefaan</div>
<div class='content'>
I coded a Java GA for solving sudoku this year in a Meta-Heuristics course I took post grad. The best solving time I obtained from a few runs of Al Escargot was 9 minutes 41 seconds on a 2.66 GHz Core 2 workstation; other problems took from under one to 40-odd seconds depending on the difficulty. In the end the GA used four different methods of applying selection pressure, four more methods to then pair up parents, six different crossover methods, a modified island model and population restarting conditions&#8230; Needless to say, compared to other meta-heuristics it is a very inefficient method.</div>
</div>
<div class='comment'>
<div class='author'>Cleve</div>
<div class='content'>
Human puzzle solvers and computers use very different approaches for Sudoku.  Humans are good at finding patterns.  Computers use brute force with genetic algorithms or other forms of recursive backtracking.  Take a look at:<br />   http://www.mathworks.com/moler/exm/chapters/sudoku.pdf</div>
</div>
<div class='comment'>
<div class='author'>Jerren</div>
<div class='content'>
So there is a GA to solve Sudoku.  That means there must be a constructive algorithm to solve it too.<br />What is the constructive algorithm?</div>
</div>
<div class='comment'>
<div class='author'>StartBreakingFree.com</div>
<div class='content'>
Pretty cool, so did it solve one sudoku puzzle or did it evolve an algorithm to solve sudoku puzzles in general?<br /><br />Thanks for posting it!</div>
</div>
<div class='comment'>
<div class='author'>micks</div>
<div class='content'>
Very interesting way to solve sudoku.<br /><br />I am just curious, exactly how long did it take for you to solve Sudoku? <br />I have a solution that solves Sudoku (among other tougher problems) very fast. <br />I just want to know your number before telling (bragging, as some might say) about mine.<br /><br />I just wanna compare the speed of the two solutions, yours and mine.<br />Really good post.</div>
</div>
<div class='comment'>
<div class='author'>vaevictus-net</div>
<div class='content'>
Seems like genetic algorithms for sudoku solutions are a decent example of GA.  But I think it might be fun/interesting to use GA to build some sort of logic engine to solve sudokus.  :)</div>
</div>
<div class='comment'>
<div class='author'>Daivd</div>
<div class='content'>
@albert<br /><br />I agree, that is a good approach.</div>
</div>
<div class='comment'>
<div class='author'>albert</div>
<div class='content'>
This is better:<br />http://norvig.com/sudoku.html</div>
</div>
<div class='comment'>
<div class='author'>Doug</div>
<div class='content'>
Damn, This is mind numbing good&#8230;</div>
</div>
</div>
]]></content>
  </entry>
  
  <entry>
    <title type="html"><![CDATA[Gates in practice]]></title>
    <link href="http://gurgeh.github.com/blog/2010/04/27/gates-in-practice/"/>
    <updated>2010-04-27T00:00:00+02:00</updated>
    <id>http://gurgeh.github.com/blog/2010/04/27/gates-in-practice</id>
    <content type="html"><![CDATA[<div class='post'>
<div style="margin-bottom: 0px; margin-left: 0px; margin-right: 0px; margin-top: 0px;">Remember the classification problem from <a href="http://fakeguido.blogspot.com/2010/04/confident-optimization-using-gates.html">part 1</a>? I wrote a small script to simulate solutions with different success rates.</div><div style="margin-bottom: 0px; margin-left: 0px; margin-right: 0px; margin-top: 0px;"><br /></div><div style="margin-bottom: 0px; margin-left: 0px; margin-right: 0px; margin-top: 0px;">For our first investigation, let us say that we have created 600 solutions. 400 of them have a true fitness of 0.5 (a 50% chance to answer correctly - no better than random since our test cases are 50/50 positive and negative) and 200 of them have found something and have a true fitness of 0.6. One limitation to these simulations is that we assume that two different solutions with a fitness of 0.6 are totally uncorrelated on the tests, while in reality the solutions might very well have picked up on the same pattern and be completely correlated.</div><div style="margin-bottom: 0px; margin-left: 0px; margin-right: 0px; margin-top: 0px;"><br /></div><div style="margin-bottom: 0px; margin-left: 0px; margin-right: 0px; margin-top: 0px;">In our hypothetical situation we have 300 different test cases that are out of sample, i.e. not used by the process that created the solutions. 300 test cases might seem like very few, but the circumstances are favorable. The 50/50 split between positive and negative helps. If it were just one positive in a hundred, many more cases would be needed. 
If the test cases are weighted, with some more important than others to get right, you also need more test cases.</div><div><br /></div><div style="margin-bottom: 0px; margin-left: 0px; margin-right: 0px; margin-top: 0px;">Running the 600 solutions through the 300 test cases, we might get the following histogram with 15 bins:</div><div class="separator" style="clear: both; text-align: center;"><a href="http://4.bp.blogspot.com/_-8MSZS6yWdk/S9VX31h9Q7I/AAAAAAAAAEA/6fYx09jTTRw/s1600/Gate+histogram+for+peaks.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" height="482" src="http://4.bp.blogspot.com/_-8MSZS6yWdk/S9VX31h9Q7I/AAAAAAAAAEA/6fYx09jTTRw/s640/Gate+histogram+for+peaks.png" width="640" /></a></div><div style="margin-bottom: 0px; margin-left: 0px; margin-right: 0px; margin-top: 0px;"><br /></div><div style="margin-bottom: 0px; margin-left: 0px; margin-right: 0px; margin-top: 0px;">If we could measure the fitness exactly, we would just see two bars - one at 0.5 and one at 0.6. Based on this graph, can we take the best solution and ask what is the probability that it is better than 0.55? 0.58? 0.62? It is hard to know. It really looks like there ought to be some solution better than 0.6.</div><div style="margin-bottom: 0px; margin-left: 0px; margin-right: 0px; margin-top: 0px;"><br /></div><div style="margin-bottom: 0px; margin-left: 0px; margin-right: 0px; margin-top: 0px;"></div><div style="margin-bottom: 0px; margin-left: 0px; margin-right: 0px; margin-top: 0px;">I divided the 300 tests into three gates with 100 in each. 
In some of my tests, no solution made it through all the gates, suggesting that I either needed more tests or more solutions, so just to test the theory I increased the number of 0.5-solutions to 100 000 and 0.6-solutions to 50 000.</div><div style="margin-bottom: 0px; margin-left: 0px; margin-right: 0px; margin-top: 0px;"><br /></div><div style="margin-bottom: 0px; margin-left: 0px; margin-right: 0px; margin-top: 0px;">Running the solutions through the gates with a cutoff of 0.55 (meaning that we search for solutions that have a true fitness of 0.55 or better and only let those through that have a measured fitness of 0.55 or better on each gate) yielded the following result:</div><div style="margin-bottom: 0px; margin-left: 0px; margin-right: 0px; margin-top: 0px;"><i>C0 = 150000</i></div><div style="margin-bottom: 0px; margin-left: 0px; margin-right: 0px; margin-top: 0px;"><i>C1 = 54774 (C0 / C1 = 2.74)</i></div><div style="margin-bottom: 0px; margin-left: 0px; margin-right: 0px; margin-top: 0px;"><i>C2 = 35543 (C1 / C2 = 1.54)</i></div><div style="margin-bottom: 0px; margin-left: 0px; margin-right: 0px; margin-top: 0px;"><i>C3 = 27891 (C2 / C3 = 1.27)</i></div><div style="margin-bottom: 0px; margin-left: 0px; margin-right: 0px; margin-top: 0px;"><br /></div><div style="margin-bottom: 0px; margin-left: 0px; margin-right: 0px; margin-top: 0px;">The ratios are decreasing nicely, as expected. 
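<br /><br />A rough sketch of how such a gate simulation might look in Python 3; this is not the script that produced the numbers above. The names <i>run_gates</i> and <i>min_hits</i> are made up here, treating a score of exactly 55 out of 100 as a pass for the 0.55 cutoff is an assumption, and the smaller population of 600 is used so that it runs quickly in pure Python.<br /><br />
<pre class="brush: python">
import random


def run_gates(true_fitnesses, min_hits, gates=3, tests_per_gate=100, seed=0):
    """Return [C0, C1, ...]: how many solutions survive each successive gate."""
    rng = random.Random(seed)
    survivors = list(true_fitnesses)
    counts = [len(survivors)]
    for _ in range(gates):
        passed = []
        for p in survivors:
            # Each test case is answered correctly with probability p,
            # independently of every other test case and solution.
            hits = sum(1 for _ in range(tests_per_gate) if p > rng.random())
            if hits >= min_hits:
                passed.append(p)
        survivors = passed
        counts.append(len(survivors))
    return counts


# The first, smaller experiment: 400 solutions at 0.5 and 200 at 0.6.
counts = run_gates([0.5] * 400 + [0.6] * 200, min_hits=55)
print(counts)
</pre>
<br />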
I solved the equations in part 1 numerically, using my genetic algorithm library,&nbsp;<a href="http://code.google.com/p/optopus">optopus</a>, for Python, thus ending up with the following:</div><div style="margin-bottom: 0px; margin-left: 0px; margin-right: 0px; margin-top: 0px;"><i>0.34 good solutions, with 0.82 passing through each gate</i></div><div style="margin-bottom: 0px; margin-left: 0px; margin-right: 0px; margin-top: 0px;"><i>0.66 bad solutions, with 0.14 passing through each gate</i></div><div style="margin-bottom: 0px; margin-left: 0px; margin-right: 0px; margin-top: 0px;"><br /></div><div style="margin-bottom: 0px; margin-left: 0px; margin-right: 0px; margin-top: 0px;">This means that the predicted probability of drawing a good solution after each gate is:</div><div style="margin-bottom: 0px; margin-left: 0px; margin-right: 0px; margin-top: 0px;"><i>0.34 (before any gate. True value is approx. 0.33)</i></div><div style="margin-bottom: 0px; margin-left: 0px; margin-right: 0px; margin-top: 0px;"><i>0.75 (after one gate. 
True 0.75)</i></div><div style="margin-bottom: 0px; margin-left: 0px; margin-right: 0px; margin-top: 0px;"><i>0.95 (0.95)</i></div><div style="margin-bottom: 0px; margin-left: 0px; margin-right: 0px; margin-top: 0px;"><i>0.99 (0.99)</i></div><br /><div style="margin-bottom: 0px; margin-left: 0px; margin-right: 0px; margin-top: 0px;"><br /></div><div style="margin-bottom: 0px; margin-left: 0px; margin-right: 0px; margin-top: 0px;">So in this particular special case, our method gets exactly the correct answer!</div><div style="margin-bottom: 0px; margin-left: 0px; margin-right: 0px; margin-top: 0px;"><br /></div><div style="margin-bottom: 0px; margin-left: 0px; margin-right: 0px; margin-top: 0px;">Let us make it slightly harder and choose a cutoff that is not exactly in the middle between 0.5 and 0.6.</div><div style="margin-bottom: 0px; margin-left: 0px; margin-right: 0px; margin-top: 0px;">Running the solutions as above with cutoff 0.58 (we now ask how many are equal to or better than 0.58) yielded this:</div><div style="margin-bottom: 0px; margin-left: 0px; margin-right: 0px; margin-top: 0px;"></div><div style="margin-bottom: 0px; margin-left: 0px; margin-right: 0px; margin-top: 0px;"><i>C0 = 150000</i></div><div style="margin-bottom: 0px; margin-left: 0px; margin-right: 0px; margin-top: 0px;"><i>C1 = 35450 (C0 / C1 = 4.23)</i></div><div style="margin-bottom: 0px; margin-left: 0px; margin-right: 0px; margin-top: 0px;"><i>C2 = 19571 (C1 / C2 = 1.81)</i></div><div style="margin-bottom: 0px; margin-left: 0px; margin-right: 0px; margin-top: 0px;"><i>C3 = 11994 (C2 / C3 = 1.63)</i></div><br /><div style="margin-bottom: 0px; margin-left: 0px; margin-right: 0px; margin-top: 0px;"><br /></div><div style="margin-bottom: 0px; margin-left: 0px; margin-right: 0px; margin-top: 0px;">Once again we get a nice decreasing ratio. 
Solving it numerically gets us these parameters</div><div style="margin-bottom: 0px; margin-left: 0px; margin-right: 0px; margin-top: 0px;"><i>0.34 good solutions, with 0.62 passing through each gate</i></div><div style="margin-bottom: 0px; margin-left: 0px; margin-right: 0px; margin-top: 0px;"><i>0.66 bad solutions with 0.04 passing through each gate</i></div><div style="margin-bottom: 0px; margin-left: 0px; margin-right: 0px; margin-top: 0px;"><br /></div><div style="margin-bottom: 0px; margin-left: 0px; margin-right: 0px; margin-top: 0px;">and these predicted probabilities of drawing a good solution after each gate:</div><div style="margin-bottom: 0px; margin-left: 0px; margin-right: 0px; margin-top: 0px;"><i>0.34 (True: 0.33)</i></div><div style="margin-bottom: 0px; margin-left: 0px; margin-right: 0px; margin-top: 0px;"><i>0.88 (0.88)</i></div><div style="margin-bottom: 0px; margin-left: 0px; margin-right: 0px; margin-top: 0px;"><i>0.99 (0.99)</i></div><div style="margin-bottom: 0px; margin-left: 0px; margin-right: 0px; margin-top: 0px;"><i>0.999 (0.999)</i></div><div style="margin-bottom: 0px; margin-left: 0px; margin-right: 0px; margin-top: 0px;"><br /></div><div style="margin-bottom: 0px; margin-left: 0px; margin-right: 0px; margin-top: 0px;">Ridiculously good!</div><div style="margin-bottom: 0px; margin-left: 0px; margin-right: 0px; margin-top: 0px;"><br /></div><div style="margin-bottom: 0px; margin-left: 0px; margin-right: 0px; margin-top: 0px;">Let us make it harder still and pick a cutoff above the top solutions. 
This violates the assumption in part 1, that we have at least one good solution.</div><div style="margin-bottom: 0px; margin-left: 0px; margin-right: 0px; margin-top: 0px;"><br /></div><div style="margin-bottom: 0px; margin-left: 0px; margin-right: 0px; margin-top: 0px;">With cutoff 0.62, I got this:</div><div style="margin-bottom: 0px; margin-left: 0px; margin-right: 0px; margin-top: 0px;"><i>C0 = 150000</i></div><div style="margin-bottom: 0px; margin-left: 0px; margin-right: 0px; margin-top: 0px;"><i>C1 = 16043 (C0 / C1 = 9.35)</i></div><div style="margin-bottom: 0px; margin-left: 0px; margin-right: 0px; margin-top: 0px;"><i>C2 = 4733 (C1 / C2 = 3.39)</i></div><div style="margin-bottom: 0px; margin-left: 0px; margin-right: 0px; margin-top: 0px;"><i>C3 = 1400 (C2 / C3 = 3.38)</i></div><div style="margin-bottom: 0px; margin-left: 0px; margin-right: 0px; margin-top: 0px;"><br /></div><div style="margin-bottom: 0px; margin-left: 0px; margin-right: 0px; margin-top: 0px;">Here something worrisome happens. The ratio stops decreasing. Since this ratio is supposed to asymptotically reach the inverse of Pg, the probability that a good solution passes the gate, this should trigger an alarm. If we say that the fitness of a good solution has to be better than or equal to 0.62 and have a cutoff of 0.62 in the gates, how can only one in 3.38 pass? 
We should always get at least half of the good solutions through.</div><div style="margin-bottom: 0px; margin-left: 0px; margin-right: 0px; margin-top: 0px;"><br /></div><div style="margin-bottom: 0px; margin-left: 0px; margin-right: 0px; margin-top: 0px;">Running the GA gives:</div><div style="margin-bottom: 0px; margin-left: 0px; margin-right: 0px; margin-top: 0px;"><i>0.35 good, with a 0.3 probability to pass</i></div><div style="margin-bottom: 0px; margin-left: 0px; margin-right: 0px; margin-top: 0px;"><i>0.65 bad with a 0.0004 probability to pass</i></div><div style="margin-bottom: 0px; margin-left: 0px; margin-right: 0px; margin-top: 0px;"><br /></div><div style="margin-bottom: 0px; margin-left: 0px; margin-right: 0px; margin-top: 0px;">Which is completely wrong, since there are only bad solutions if we try to find solutions that are 0.62 or better. Remember that all our solutions have a true fitness of 0.5 or 0.6. We again get a stern warning that something is wrong, since a 0.3 probability to pass is too low. We expect at least half of the good solutions to pass. More than half should pass if the solutions are noticeably above the cutoff value, but anything under 0.5 means we are in trouble.</div><div style="margin-bottom: 0px; margin-left: 0px; margin-right: 0px; margin-top: 0px;"><br /></div><div style="margin-bottom: 0px; margin-left: 0px; margin-right: 0px; margin-top: 0px;">Taking that warning into account, it would seem that the gate test worked here as well.</div><div style="margin-bottom: 0px; margin-left: 0px; margin-right: 0px; margin-top: 0px;"><br /></div><div style="margin-bottom: 0px; margin-left: 0px; margin-right: 0px; margin-top: 0px;"><span class="Apple-style-span" style="font-size: x-large;">The problem</span></div><div style="margin-bottom: 0px; margin-left: 0px; margin-right: 0px; margin-top: 0px;"><br /></div><div style="margin-bottom: 0px; margin-left: 0px; margin-right: 0px; margin-top: 0px;">Does it always work? Sadly, no. 
As I hinted in my last post, there is a problem. Let us say that instead of two kinds of solutions, 0.5 and 0.6, we have 100 solutions distributed evenly between 0.3 and 0.7. Just sending this solution distribution through the 300 tests yields (15 bins):</div><div class="separator" style="clear: both; text-align: center;"><a href="http://4.bp.blogspot.com/_-8MSZS6yWdk/S9VX8HlcdWI/AAAAAAAAAEI/pcuKNok-R6s/s1600/Gate+histogram+for+even+dist.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" height="482" src="http://4.bp.blogspot.com/_-8MSZS6yWdk/S9VX8HlcdWI/AAAAAAAAAEI/pcuKNok-R6s/s640/Gate+histogram+for+even+dist.png" width="640" /></a></div><div style="margin-bottom: 0px; margin-left: 0px; margin-right: 0px; margin-top: 0px;">The distribution does not look very uniform anymore, but at least it looks like we have nothing over 0.7, which is true.</div><div style="margin-bottom: 0px; margin-left: 0px; margin-right: 0px; margin-top: 0px;"><br /></div><div style="margin-bottom: 0px; margin-left: 0px; margin-right: 0px; margin-top: 0px;"></div><div style="margin-bottom: 0px; margin-left: 0px; margin-right: 0px; margin-top: 0px;">With cutoff 0.65, we get:</div><div style="margin-bottom: 0px; margin-left: 0px; margin-right: 0px; margin-top: 0px;"><i>C0 = 40000</i></div><div style="margin-bottom: 0px; margin-left: 0px; margin-right: 0px; margin-top: 0px;"><i>C1 = 4656 (C0 / C1 = 8.59)</i></div><div style="margin-bottom: 0px; margin-left: 0px; margin-right: 0px; margin-top: 0px;"><i>C2 = 2385 (C1 / C2 = 1.95)</i></div><div style="margin-bottom: 0px; margin-left: 0px; margin-right: 0px; margin-top: 0px;"><i>C3 = 1392 (C2 / C3 = 1.71)</i></div><div><br /></div><div>Everything looks all right. 
Running the GA gives:</div><div style="margin-bottom: 0px; margin-left: 0px; margin-right: 0px; margin-top: 0px;"></div><div style="margin-bottom: 0px; margin-left: 0px; margin-right: 0px; margin-top: 0px;"><i>0.17 good, with a 0.58 probability to pass</i></div><div style="margin-bottom: 0px; margin-left: 0px; margin-right: 0px; margin-top: 0px;"><i>0.83 bad with a 0.18 probability to pass</i></div><div style="margin-bottom: 0px; margin-left: 0px; margin-right: 0px; margin-top: 0px;"><br /></div><br /><div style="margin-bottom: 0px; margin-left: 0px; margin-right: 0px; margin-top: 0px;"></div><div style="margin-bottom: 0px; margin-left: 0px; margin-right: 0px; margin-top: 0px;"><i>0.17 (True: 0.13)</i></div><div style="margin-bottom: 0px; margin-left: 0px; margin-right: 0px; margin-top: 0px;"><i>0.87 (0.68)</i></div><div style="margin-bottom: 0px; margin-left: 0px; margin-right: 0px; margin-top: 0px;"><i>0.99 (0.85)</i></div><div style="margin-bottom: 0px; margin-left: 0px; margin-right: 0px; margin-top: 0px;"><i>0.999 (0.93)</i></div><div style="margin-bottom: 0px; margin-left: 0px; margin-right: 0px; margin-top: 0px;"><br /></div><div style="margin-bottom: 0px; margin-left: 0px; margin-right: 0px; margin-top: 0px;">As can be seen, the algorithm overestimates how many good solutions there are. This happens because one of our initial assumptions, that there are only either good or bad solutions, is wrong. Since the C-ratios decrease, it looks like we get progressively more and more good solutions, which we are, but at a smaller rate than indicated. 
What happens is that the better a solution is, the more likely it is to survive, so the gates make sure that the better of the bad solutions survive, which makes a larger ratio of the entire &#8220;population&#8221; of solutions survive each time.</div><div style="margin-bottom: 0px; margin-left: 0px; margin-right: 0px; margin-top: 0px;"><br /></div><div style="margin-bottom: 0px; margin-left: 0px; margin-right: 0px; margin-top: 0px;">Note that the answer will not always be overestimated. If the even distribution is located mainly to the right of the cutoff, I think that the answer will instead be underestimated, but I have not yet tested this. It might be worth mentioning that running with cutoff 0.72 yielded very bad C-ratios, so that still works as before.</div><br /><div style="margin-bottom: 0px; margin-left: 0px; margin-right: 0px; margin-top: 0px;"><br /></div><div style="margin-bottom: 0px; margin-left: 0px; margin-right: 0px; margin-top: 0px;"><span class="Apple-style-span" style="font-size: x-large;">Gates as transfer functions</span></div><div style="margin-bottom: 0px; margin-left: 0px; margin-right: 0px; margin-top: 0px;"><br /></div><div style="margin-bottom: 0px; margin-left: 0px; margin-right: 0px; margin-top: 0px;">A series of test cases and a cutoff value can be seen as a transfer function on the distribution of solutions. When the solutions are run through a gate, the transfer function is <a href="http://en.wikipedia.org/wiki/Convolve">convolved</a>&nbsp;with the solution distribution to yield a new solution distribution. In the graph below, I have plotted the transfer functions for five different gates, with the same cutoff but with different numbers of test cases. 
Once again we have relatively few test cases, but just as we discussed earlier, if the test cases are not divided 50/50 between positive and negative or there are weights, we need many more to get the same gate characteristics.</div><div style="margin-bottom: 0px; margin-left: 0px; margin-right: 0px; margin-top: 0px;"><br /></div><div class="separator" style="clear: both; text-align: center;"></div><div class="separator" style="clear: both; text-align: center;"><a href="http://3.bp.blogspot.com/_-8MSZS6yWdk/S9WV3GVHcHI/AAAAAAAAAEg/aJI4ZFzcr7c/s1600/Gate+Transfer+Functions+70.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" height="482" src="http://3.bp.blogspot.com/_-8MSZS6yWdk/S9WV3GVHcHI/AAAAAAAAAEg/aJI4ZFzcr7c/s640/Gate+Transfer+Functions+70.png" width="640" /></a></div><div style="margin-bottom: 0px; margin-left: 0px; margin-right: 0px; margin-top: 0px;">The more test cases that the gate has, the steeper the transfer function. It will look more and more like a step function as the gate gets &#8220;larger&#8221;. In the limit where it actually is a step function (with infinitely many test cases), our earlier assumption fully holds and the predictions would again be correct. These transfer functions will move the solution distribution to the right (better true fitness). 
The only thing we can measure is still C, the number of solutions left, which in the continuous case would be the integral of the solution distribution.</div><div style="margin-bottom: 0px; margin-left: 0px; margin-right: 0px; margin-top: 0px;"><br /></div><div style="margin-bottom: 0px; margin-left: 0px; margin-right: 0px; margin-top: 0px;"><span class="Apple-style-span" style="font-size: x-large;">Possible solutions</span></div><div style="margin-bottom: 0px; margin-left: 0px; margin-right: 0px; margin-top: 0px;"><br /></div><div style="margin-bottom: 0px; margin-left: 0px; margin-right: 0px; margin-top: 0px;">If we knew the shape of our initial solution distribution and the shape of the transfer function, we would once again be in business and be able to predict the number of solutions above a cutoff correctly. The transfer function can be estimated, but the solution distribution can vary greatly. Maybe all solutions have the same true fitness? Maybe they are divided into two &#8220;pillars&#8221; as in the example above? Maybe they are spread in some more complex shape across the spectrum?</div><div style="margin-bottom: 0px; margin-left: 0px; margin-right: 0px; margin-top: 0px;"><br /></div><div style="margin-bottom: 0px; margin-left: 0px; margin-right: 0px; margin-top: 0px;">Instead of just finding out the value of two points of the solution distribution (good and bad), we would get better estimates if we could get the value of more points. But knowledge of more variables requires more equations. Where will these equations come from? We have two sources of untapped information.</div><div style="margin-bottom: 0px; margin-left: 0px; margin-right: 0px; margin-top: 0px;"><br /></div><div style="margin-bottom: 0px; margin-left: 0px; margin-right: 0px; margin-top: 0px;">First of all, we have not used the exact result when the solutions are run against the test, only whether they pass the cutoff or not. 
By letting the cutoff move from 0 to 1 and running the same gates over and over again, we can gain additional information on the shape of the solution distribution. If the transfer functions were linear, we would not gain additional information from this, but since they are not, the different resulting integrals can tell us something.</div><div class="separator" style="clear: both; text-align: center;"><a href="http://4.bp.blogspot.com/_-8MSZS6yWdk/S9VX_cQzHlI/AAAAAAAAAEY/AJtkYerEHtc/s1600/Gate+Transfer+Functions+with+9+Test+Cases.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" height="482" src="http://4.bp.blogspot.com/_-8MSZS6yWdk/S9VX_cQzHlI/AAAAAAAAAEY/AJtkYerEHtc/s640/Gate+Transfer+Functions+with+9+Test+Cases.png" width="640" /></a></div><div style="margin-bottom: 0px; margin-left: 0px; margin-right: 0px; margin-top: 0px;">The results of the different cutoffs are of course not totally independent, so each new cutoff will not magically bring about a new independent equation, but it will still improve our understanding of the shape of the solution distribution.</div><div style="margin-bottom: 0px; margin-left: 0px; margin-right: 0px; margin-top: 0px;"><br /></div><div style="margin-bottom: 0px; margin-left: 0px; margin-right: 0px; margin-top: 0px;">The other source of information that we have not used is that not all solutions run against all test cases, since we filter out so many after each gate. This means that there are tests (thus information) that are not being used. The easiest way to use this information is to do as suggested in <a href="http://fakeguido.blogspot.com/2010/04/confident-optimization-using-gates.html">part 1</a>&nbsp;and rerun the gate experiment with the gates in different order. 
This will not give more equations, but will increase the accuracy of the measured Cs, which in turn may allow more gates with fewer tests in each, which gives more equations.</div><br /><div style="margin-bottom: 0px; margin-left: 0px; margin-right: 0px; margin-top: 0px;"><br /></div><div style="margin-bottom: 0px; margin-left: 0px; margin-right: 0px; margin-top: 0px;">This is my main approach and the one that I will test in practice and write my next post about.</div><div style="margin-bottom: 0px; margin-left: 0px; margin-right: 0px; margin-top: 0px;"><br /></div><div style="margin-bottom: 0px; margin-left: 0px; margin-right: 0px; margin-top: 0px;">I have, however, two other rough ideas that one could pursue.</div><div style="margin-bottom: 0px; margin-left: 0px; margin-right: 0px; margin-top: 0px;"><br /></div><div style="margin-bottom: 0px; margin-left: 0px; margin-right: 0px; margin-top: 0px;">One approach is to abandon gates and look at the measured fitness histogram for all test cases (like the two I have shown above) and try to run it in reverse, by generating hypothetical true distributions and seeing how likely they are to end up in the measured distribution after the blurring effect of the test process. Maybe image deblurring techniques such as Gaussian&nbsp;<a href="http://en.wikipedia.org/wiki/Noise_reduction">noise reduction</a>&nbsp;can be used?</div><div style="margin-bottom: 0px; margin-left: 0px; margin-right: 0px; margin-top: 0px;"><br /></div><div style="margin-bottom: 0px; margin-left: 0px; margin-right: 0px; margin-top: 0px;">The other approach is to study&nbsp;<a href="http://en.wikipedia.org/wiki/Deconvolution">deconvolution</a>&nbsp;- the process of running a convolution in reverse. According to the Wikipedia article, deconvolution is often associated with image processing, where it is used to reverse optical distortion. 
Maybe this method will actually converge with the deblurring approach?</div><div style="margin-bottom: 0px; margin-left: 0px; margin-right: 0px; margin-top: 0px;"><br /></div></div>
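<div>The transfer function of a gate can be sketched numerically. The code below is only an illustration under assumptions that the post does not state: every test case is passed independently with a probability equal to the solution's true fitness, a gate lets a solution through when it gets at least <i>min_correct</i> of <i>n_tests</i> right, and all gates are identical. In this sketch a gate acts by multiplying the number of solutions at each true fitness by that fitness's pass probability. The names <i>pass_probability</i> and <i>expected_survivors</i> are made up for the example.</div>

```python
from math import comb


def pass_probability(p, n_tests, min_correct):
    """Chance that a solution with true fitness p gets at least
    min_correct of n_tests independent test cases right (binomial tail)."""
    return sum(comb(n_tests, k) * p**k * (1 - p)**(n_tests - k)
               for k in range(min_correct, n_tests + 1))


def expected_survivors(fitnesses, n_tests, min_correct, n_gates):
    """Expected population sizes C0..Cn when every solution is sent
    through n_gates identical, independent gates in sequence."""
    q = [pass_probability(p, n_tests, min_correct) for p in fitnesses]
    return [sum(qi**k for qi in q) for k in range(n_gates + 1)]


# Transfer function of a small and a large gate, both with a 65% cutoff.
for p in (0.5, 0.6, 0.7, 0.8):
    small = pass_probability(p, 20, 14)    # 14 of 20 is the first count above 65%
    large = pass_probability(p, 100, 66)   # 66 of 100 is the first count above 65%
    print(f"p={p:.1f}  20 tests: {small:.3f}  100 tests: {large:.3f}")

# 100 solutions spread evenly between 0.3 and 0.7, three gates of 100 tests.
population = [0.3 + 0.4 * i / 99 for i in range(100)]
counts = expected_survivors(population, 100, 66, 3)
ratios = [counts[i] / counts[i + 1] for i in range(3)]
print(counts)
print(ratios)
```

<div>The larger gate gives a lower pass probability below the cutoff and a higher one above it, which is the steepening described above. The C-ratios of the evenly spread population also decrease from gate to gate, even though no solution in it is better than 0.7.</div>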
]]></content>
  </entry>
  
  <entry>
    <title type="html"><![CDATA[Confident optimization using gates]]></title>
    <link href="http://gurgeh.github.com/blog/2010/04/14/confident-optimization-using-gates/"/>
    <updated>2010-04-14T00:00:00+02:00</updated>
    <id>http://gurgeh.github.com/blog/2010/04/14/confident-optimization-using-gates</id>
    <content type="html"><![CDATA[<div class='post'>
When dealing with any non-linear optimization or classification algorithm, like <a href="http://en.wikipedia.org/wiki/Genetic_algorithms">Genetic Algorithms</a>, <a href="http://en.wikipedia.org/wiki/Artificial_neural_network">Artificial Neural Networks</a> or <a href="http://en.wikipedia.org/wiki/Simulated_annealing">Simulated Annealing</a>, you need a way to compute the fitness of your candidate solutions. These algorithms all work in roughly the same way - you generate a solution, test it and generate new solutions based on feedback from the testing (the feedback will usually just consist of a fitness value).<br /><br />For some problems you can get the <i>true fitness</i> of a solution. If you, for example, are maximizing a known mathematical function of many variables, you immediately know exactly how good your solution is. However, for most interesting problems you will never know the true fitness. If you are evolving parameters for a poker playing program, a stock predictor or a walrus image classifier, you never know quite how good your solution is in general. The best you can do is try your solution on a number of test cases and assume that your average performance on those tests is the same as your average performance when the number of test cases approaches infinity.<br /><br />An optimization algorithm will often become over-specialized in the test cases that it is trained on. To combat this, a method called &#8220;holdout validation&#8221; is often used. The data is divided up into several disjoint sets - a training set used for fitness calculation of the millions or billions of proposed solutions, a validation set for validating the fitness of solutions that are candidates for being the most promising so far and a test set for testing the final solution of your run. Often you will make several runs of your optimization algorithm with different types of inputs and parameters. 
The test set is used to decide which of the runs produced the best solution.<br /><br />This standard approach will sometimes work, but there are problems. If your fitness function is very volatile and test cases are hard to come by, you can never be quite sure how consistent your solution is. What is the chance that a solution will happen to get lucky on all three sets of test cases? Low? If I am persistent and keep running the same optimization problem on my computer cluster with the inputs prepared differently and different parameters until I get something that finally passes my tests, what is the risk that after billions and billions of tries, I have just found a fluke that will not perform well on further data? Obviously one can never be completely sure, but <b>it would be nice to at least know what the probability is that we have found a real solution</b>.<br /><br />One possibly more effective way of using your data is to employ&nbsp;<a href="http://en.wikipedia.org/wiki/Cross-validation_(statistics)">cross validation</a>, but if you test several solutions, the problems with &#8220;luck&#8221; will reappear.<br /><br />I&#8217;d like to explore a different theory that I have been tinkering with.<br /><br /><span class="Apple-style-span" style="font-size: x-large;">Gates - an introduction</span><br /><br />In this introduction, I will make a number of simplifications, assumptions and ungodly approximations. I hope to remedy this in the next post.<br /><br />Let us take the problem of walrus image recognition from the introduction. Imagine that we have a number of subaquatic images. Half of them depict sea weed, lobsters and whatnot and half are of walruses. A good walrus classifier will identify over 70% of the images correctly.<br /><br />Let us further make the assumption that a walrus classifier is either good or bad, with nothing in between. 
This assumption cannot be entirely true, since an optimization algorithm needs a way to arrive at its solution through gradual improvement, which means that there must be somewhat good solutions. Nevertheless, it will have to do as an approximation for now.<br /><br />Assume we have a black-box algorithm that spits solutions at us. We will call the number of good solutions at a certain time <i>Sg</i> and the number of bad solutions <i>Sb</i>. The total number of solutions, <i>C0</i>, is just C0 = Sg + Sb.<br /><br />Take the images that the algorithm did not get to see and divide them randomly into three equally large sets. Each set is now a <i>Gate</i>, which will let through only those solutions that can correctly classify above 70% (it does not have to be the same as our target percentage) of the images in the set. We can assume that it is approximately equally hard to get through any gate. This assumption can be tested by simply sending our solutions through each gate and checking that roughly the same number of solutions pass.<br /><br />Unfortunately there is a chance that a bad solution could get lucky and pass the gate (a false positive) or that a good solution could get unlucky and fail (a false negative), but for a meaningful test, a good solution must always have a better chance to pass the gate than a bad solution.<br /><br />Let us define the probability that a good solution passes a gate as <i>Pg</i> and the probability that a bad solution passes as <i>Pb</i>. As the number of test cases in each gate approaches infinity, Pg will approach 1 and Pb will approach 0.<br /><br />If we set up an experiment where we first send our solutions through the first gate, then send the survivors through the second and finally those survivors through the third gate, we can measure the remaining population size at four points. 
After the black box: C0, after the first gate: C1, second gate: C2 and third gate: C3.<br /><br />If our gates are testing anything relevant and our population consists of both good and bad solutions (Sg != 0 and Sb != 0) we can immediately see that the ratio between successive gates should decrease: C0/C1 &gt; C1/C2 &gt; C2/C3, because there will be a greater ratio of good solutions to bad solutions after each gate and Pg &gt; Pb. In plain English - a smaller percentage of the solutions should disappear each time, as the population gradually contains more good solutions. If this does not hold, we must either have bad gates (too few test cases or something) or no good solutions or no bad solutions. As the ratio of bad to good solutions decreases, C(i+1) / C(i) will approach Pg as most solutions will be good.<br /><br /><span class="Apple-style-span" style="font-size: x-large;">Fortuitous results</span><br /><br />The number of good solutions that remain after the first gate is Sg * Pg and the number of bad is Sb * Pb. Thus we get four equations:<br /><br />C0 = Sg + Sb<br />C1 = Sg * Pg + Sb * Pb<br />C2 = Sg * Pg<sup>2</sup> + Sb * Pb<sup>2</sup><br />C3 = Sg * Pg<sup>3</sup> + Sb * Pb<sup>3</sup><br /><sup><br /></sup><br /><div style="text-align: center;"><sup><span class="Apple-style-span" style="font-size: medium;"><a href="http://4.bp.blogspot.com/_-8MSZS6yWdk/S8XNQldi6DI/AAAAAAAAADw/DABY2O4j6lc/s1600/SimplifiedGates.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" src="http://4.bp.blogspot.com/_-8MSZS6yWdk/S8XNQldi6DI/AAAAAAAAADw/DABY2O4j6lc/s320/SimplifiedGates.png" /></a></span></sup></div><br />Since we have four equations and four unknowns, Sg, Sb, Pg and Pb, we should be able to solve for the ratio of good to bad solutions and for the characteristics of the gates. Subsequently we can tell what the chance is that we pick a good solution if we randomly pick one after a certain number of gates. 
We can also tell how many solutions we will need to generate on average until we have at least one solution that passes through a certain number of gates.<br /><br />There is nothing magical about three gates. If we use more gates with fewer tests in each (thus making Pg and Pb closer) we will get different characteristics. This will result in more equations and the variables will be overdetermined, but they can still be determined using, for example, a least-squares fitting. Trying different numbers of gates and different gate sizes can help us find the optimal use of our test cases.<br /><br />It is important that Pg and Pb are roughly the same for each gate, in other words that one gate is not significantly harder or easier to pass through than the others. If the gate sizes are large this is more likely to be true. It is straightforward to test this assumption. You can either:<br /><br /><ul><li>Do the simple test described earlier, making sure that C is roughly the same for each gate.</li><li>Put the gates in different order and re-run the experiment, verifying that the results are the same.</li><li>Run the experiment several times, dividing the test cases into three entirely new gates each time and determine the standard deviation of the calculated parameters.</li></ul><br />If C is not roughly the same for each gate, you need larger gates. Go find more data or use fewer gates, but no fewer than three. That simple. (Almost.. we have not defined exactly how rough &#8220;roughly&#8221; is)<br /><br /><span class="Apple-style-span" style="font-size: x-large;">Caveat</span><br />So.. Is everything solved that easily? Can we now go out and confidently optimize the world as the title suggests? No, but I believe this approximation can be of great use as it stands. Earlier we made the assumption that solutions are either good or bad, instead of somewhere in a continuous fitness spectrum. In my next post, I will explain why this makes things a bit more complicated.</div>
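<div>The four equations above can be solved in closed form. The sketch below is one possible way to do it, not necessarily the method used for the results in these posts: it relies on the fact that C(k+2) = (Pg + Pb) * C(k+1) - Pg * Pb * C(k), the same idea as Prony's method for fitting sums of exponentials. The function name <i>solve_gates</i> and the example numbers are made up for illustration.</div>

```python
from math import sqrt


def solve_gates(c0, c1, c2, c3):
    """Recover (Sg, Sb, Pg, Pb) from the four measured population sizes."""
    # Pg and Pb are the roots of x^2 - s*x + p = 0, where s = Pg + Pb and
    # p = Pg * Pb satisfy C(k+2) = s * C(k+1) - p * C(k) for k = 0 and k = 1.
    det = c0 * c2 - c1 * c1
    s = (c0 * c3 - c1 * c2) / det
    p = (c1 * c3 - c2 * c2) / det
    root = sqrt(s * s - 4 * p)
    pg, pb = (s + root) / 2, (s - root) / 2
    # With Pg and Pb known, C0 and C1 give two linear equations in Sg and Sb.
    sg = (c1 - pb * c0) / (pg - pb)
    return sg, c0 - sg, pg, pb


# Hypothetical population: 100 good solutions (Pg = 0.9), 900 bad (Pb = 0.2).
sizes = [100 * 0.9**k + 900 * 0.2**k for k in range(4)]
sg, sb, pg, pb = solve_gates(*sizes)
print(sg, sb, pg, pb)
```

<div>With noisy measurements the square root can turn negative or the determinant can be zero, in which case the function raises an error. That is a sign that the numbers do not fit the two-kinds-of-solutions model, for example because the C-ratios do not decrease.</div>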
<h2>Comments</h2>
<div class='comments'>
<div class='comment'>
<div class='author'>Daivd</div>
<div class='content'>
@JP I show some testing here: http://fakeguido.blogspot.com/2010/04/gates-in-practice.html</div>
</div>
<div class='comment'>
<div class='author'>JP</div>
<div class='content'>
It seems reasonable course. have you tested it yet? I mean compared to other methods.</div>
</div>
</div>
]]></content>
  </entry>
  
  <entry>
    <title type="html"><![CDATA[Why planning is hard]]></title>
    <link href="http://gurgeh.github.com/blog/2007/12/21/why-planning-is-hard/"/>
    <updated>2007-12-21T00:00:00+01:00</updated>
    <id>http://gurgeh.github.com/blog/2007/12/21/why-planning-is-hard</id>
    <content type="html"><![CDATA[<div class='post'>
After my last post about planning I thought some more on the issue and had something close to an epiphany.<br /><br />When you plan, in the solitaire sense, you need rules governing what moves are legal - transformation rules. If you treat these rules like black boxes, just understanding them by playing around with positions and seeing how they behave, you can only do so much. An important rule might be usable extremely rarely, but nevertheless be the key to success if you specifically aim to reach a position where it is applicable. This means that you might miss how important a rule (or an exception to a rule) is, when just &#8220;black-boxing&#8221; it, because its usefulness or purpose might never come up.<br /><br />Even if you have no such rare rules in your system, the best you can hope for if you want to analyze a system when black-boxing is just to formulate your own internal rules for how the system seems to behave.<br /><br />Thus, <span style="font-weight: bold;">the reason planning is hard is that you need to be able to analyze/understand code to understand transformation rules in general</span> and understanding code is hard. You need to understand when you may take actions and what these actions do, i.e. understand transformation rules, whatever system you are planning for.<br /><br /><span style="font-size:130%;">A small prediction<br /><br /></span>The reason why good planning is a key to analyzing code and proving things in formal systems is that you need to understand code in order to plan. Thus, I postulate, when something can analyze code better than I, it will quickly learn to do everything else intelligence-based better than I. The implication probably works both ways, so the first program that is thoroughly smarter than I will most likely have its foundation laid upon the ability to reason about code.</div>
<h2>Comments</h2>
<div class='comments'>
<div class='comment'>
<div class='author'>Home Broker</div>
<div class='content'>
Hello. This post is likeable, and your blog is very interesting, congratulations :-). I will add in my blogroll =). If possible gives a last there on my blog, it is about the <A HREF="http://home-broker-brasil.blogspot.com" REL="nofollow">Home Broker</A>, I hope you enjoy. The address is http://home-broker-brasil.blogspot.com. A hug.<A HREF="8250556117" REL="nofollow"></A></div>
</div>
</div>
]]></content>
  </entry>
  
  <entry>
    <title type="html"><![CDATA[Deceptively simple game]]></title>
    <link href="http://gurgeh.github.com/blog/2007/12/12/deceptively-simple-game/"/>
    <updated>2007-12-12T00:00:00+01:00</updated>
    <id>http://gurgeh.github.com/blog/2007/12/12/deceptively-simple-game</id>
    <content type="html"><![CDATA[<div class='post'>
I would pay a handsome sum (say $1 million, if I could raise it) for a program that could do the following well-defined, seemingly simple, task.<br /><br /><span style="font-size:130%;">General solitaire solver</span><br /><br />Take as input a list of rules for a solitaire-like game. The rules are deterministic transformation rules, defining which moves are legal given a certain position. The rules will be given in whatever Turing complete language the solver likes. For example a simple Scheme dialect without side-effects or a subset of x86 machine code.<br /><br />As long as it solves the task, the solver is free to treat the rules as <span style="font-style: italic;">black boxes</span> that take one position and output a, possibly empty, list of potential positions.<br /><br />The solver will then take an initial position as second input and one or more target positions as final input. In fact, to make it more general, take a function that tells whether a position is the target or not.<br /><br />As output, I want a sequence of transformations that leads from the initial position to a target. It does not have to be the shortest sequence, just a sequence. Also I want the answer reasonably fast. At least as fast as I could solve it myself.<br /><br /><span style="font-size:130%;">Extra features</span><br /><br />While I would be very happy with just the above, here are some extra features that would be nice.<br /><br /><ul><li><span style="font-size:100%;">Instead of a binary target function, let me use a continuous target function, and give as output a sequence that gives an end position with as good a score as possible.</span></li><li><span style="font-size:100%;">Accept one or more opponents. 
This would be useful for playing games - go, shogi, chess, etc, but apart from that would probably be a step towards the stochastic thing below.</span></li><li><span style="font-size:100%;">Allow the transformation rules to behave in a stochastic/probabilistic manner.</span></li></ul><span style="font-size:130%;">Why?</span><br /><br />I have no particular desire to solve solitaire automatically, so why would I want a generalized solitaire solver? Well, if you can input the rules for solitaire, you can also input the rules for <a href="http://seedai.blogspot.com/search/label/Towers%20of%20Hanoi">Towers of Hanoi</a>, which I used as an example of difficult planning in another post. Suddenly you can solve a whole range of reasoning problems, mathematical proofs, reasoning about programs and all sorts of interesting and important stuff.<br /><br />It is interesting to think about the problem from the point of view of solving solitaire or some other simple one-man game. I think it makes it less intimidating.</div>
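<div>To make the interface concrete, here is a brute-force baseline with the inputs described above: an initial position, a black-box rule function and a target function. It is plain breadth-first search, so it is nowhere near the fast general solver asked for in this post; it only shows the shape of the problem. The names <i>solve</i> and <i>hanoi_moves</i> and the tuple encoding of positions are assumptions made for the example.</div>

```python
from collections import deque


def solve(initial, moves, is_target):
    """Breadth-first search; returns the positions from initial to a target,
    or None when no target can be reached. Positions must be hashable."""
    parent = {initial: None}
    queue = deque([initial])
    while queue:
        position = queue.popleft()
        if is_target(position):
            path = []
            while position is not None:
                path.append(position)
                position = parent[position]
            return path[::-1]
        for successor in moves(position):   # the rules are a black box
            if successor not in parent:
                parent[successor] = position
                queue.append(successor)
    return None


def hanoi_moves(position):
    """Legal moves for Towers of Hanoi. A position is a tuple of three pegs,
    each peg a tuple of disk sizes listed from bottom to top."""
    successors = []
    for src in range(3):
        if not position[src]:
            continue
        disk = position[src][-1]
        for dst in range(3):
            if dst == src:
                continue
            if position[dst] and position[dst][-1] < disk:
                continue  # a larger disk may not go on top of a smaller one
            pegs = list(position)
            pegs[src] = position[src][:-1]
            pegs[dst] = position[dst] + (disk,)
            successors.append(tuple(pegs))
    return successors


start = ((3, 2, 1), (), ())
goal = ((), (), (3, 2, 1))
path = solve(start, hanoi_moves, lambda pos: pos == goal)
print(len(path) - 1, "moves")   # 7 moves for three disks
```

<div>The search remembers every position it has seen, so memory and time grow with the number of reachable positions. That is exactly why this approach stops being useful for the larger games that make the problem interesting.</div>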
]]></content>
  </entry>
  
  <entry>
    <title type="html"><![CDATA[The optimal IQ test]]></title>
    <link href="http://gurgeh.github.com/blog/2007/09/07/optimal-iq-test/"/>
    <updated>2007-09-07T00:00:00+02:00</updated>
    <id>http://gurgeh.github.com/blog/2007/09/07/optimal-iq-test</id>
    <content type="html"><![CDATA[<div class='post'>
The hardest part for me when thinking about seed AI and optimal optimization, is coming up with a good fitness (IQ) test.  Since you need the test to run fast, you end up testing that the algorithm can get somewhere fast, i.e. checking only the extreme beginning of a performance curve that ultimately must continue to be good many thousand times longer. What we want to measure is something like the <a href="http://en.wikipedia.org/wiki/Big_O_notation">Big O</a> performance of the algorithm in the limit and not what it looks like the first second of its life. Another problem is that we want the intelligence to be as general as possible and not over-specialized on solving a few test cases.<br /><br /><span style="font-size:130%;">A fitness test of a fitness test</span><br /><br />Recently I got a new idea of what constitutes a good IQ test. Our current approach to seed AI is about developing a really good programmer that can program better versions of itself. A good fitness test is a test that has a high correlation between a program testing well on it and the same program being able to generate new programs that get even better scores. Not only is this a necessary criterion. It might be <span style="font-weight: bold;">sufficient</span>. Any test of a program which means that this program is likely to produce new programs that perform well (strictly - reach a new global optimum) on the test, might be a good fitness test of what we are after. The test that produces new <span style="font-style: italic;">Masters </span>(see <a href="http://seedai.blogspot.com/2007/09/1250-press-return.html">this post</a>) most frequently might be the best test. Getting the most new Masters over time also ensures that the test does not take unnecessarily long to run. 
I am not completely sure, but we might need to force all tests to start with a kernel of an intelligence test (compress this string, predict this numeric sequence, something like that), just to set it off in the right direction and eliminate trivial solutions, like giving all programs a random IQ from some distribution. The trivial solution of giving all programs the perfect IQ would not be a candidate, because no new globally optimal solutions would be found, so no new Masters would come, and thus that fitness test would not test well on the fitness test test (am I making sense?).<br /><br />Having a fitness test of our fitness test suggests that we can start by evolving a good test, or even more beautifully, co-evolve solution and test.<br /><br />Perhaps I am just dreaming, but it sure would be a beautiful algorithm if it worked&#8230;</div>
]]></content>
  </entry>
  
  <entry>
    <title type="html"><![CDATA[Coincidence?]]></title>
    <link href="http://gurgeh.github.com/blog/2007/09/06/coincidence/"/>
    <updated>2007-09-06T00:00:00+02:00</updated>
    <id>http://gurgeh.github.com/blog/2007/09/06/coincidence</id>
    <content type="html"><![CDATA[<div class='post'>
I once read a short story about the creation of the world&#8217;s most powerful computer. In essence, each time they tried to turn it on, they had some minor misfortune, a power outage, the maid accidentally tripped on, and unplugged, the power cord, etc. The highly technical twist in the end was that since we live in a <a href="http://en.wikipedia.org/wiki/Multiverse">Multiverse</a>, all things that can happen do happen in a separate universe. It turns out that the computer was so advanced (or something) that it turned into a black hole when switched on, destroying all life. Since the observers could only exist in the universes where the computer remained switched off, they experienced these &#8220;coincidences&#8221; that protected them.<br /><br /><span style="font-size:130%;">A database of all human knowledge</span><br /><br />When I read up a bit on <a href="http://en.wikipedia.org/wiki/Cyc">Cyc</a>, the other day, I came upon a competing project that I, myself, once added some <span style="font-style: italic;">mindpixels</span> to.<blockquote><a href="http://en.wikipedia.org/wiki/Mindpixel">Mindpixel</a> was a web-based collaborative artificial intelligence project which aimed to create a database of millions of human validated true/false statements, or probabilistic propositions.</blockquote>Unfortunately the project is now defunct, since the founder Chris McKinstry committed suicide on 23rd January, 2006.<br /><br />Well, never fear, because from the Mindpixel page on Wikipedia, we learn that <a href="http://en.wikipedia.org/wiki/Open_Mind_Common_Sense">Open Mind Common Sense</a> is a similar project, run by MIT, whose goal 
is to build a large common sense knowledge base from the contributions of many thousands of people across the Web.<br /><br />Unfortunately that project is <span style="font-weight: bold;">also</span> stalling, since Push Singh, who was slated to become a professor at the MIT Media Lab to lead the Commonsense Computing group in 2007, committed suicide on Tuesday, February 28, 2006. Just a month after the other visionary of web knowledge, Chris McKinstry.<br /><br />Let the unreasonable conspiracy theories commence.<br /></div>
]]></content>
  </entry>
  
</feed>
