How do functional programming languages work?

If functional programming languages cannot save any state, how do they do simple stuff like reading input from a user? How do they “store” the input (or store any data for that matter?)

For example: how would this simple C thing translate to a functional programming language like Haskell?

int main() {
    int no;
    return 0;

(My question was inspired by this excellent post: “Execution in the Kingdom of Nouns”. Reading it gave me some better understanding of what exactly object oriented programming is, how Java implements it in one extreme manner, and how functional programming languages are a contrast.)

functional and pure programming languages

What programming languages are functional and pure?

How does TDD compare with Functional Programming Languages?

How does TDD compare with Functional Programming Languages like F# and Erlang? I haven’t actually worked directly with a functional programming language yet, but from what I’ve seen of it, you have tw

Sorting in functional programming languages

I’ve been learning functional programming for some time, but I haven’t read somewhere about sorting with the functional programming languages. I know the sorting algorithms that are based on value exc

How difficult is it to learn functional programming languages? [closed]

Given that a persona has extensive experience in different imperative programming languages right from C to JAVA. Whats the learning curve for functional programming languages such a Haskell ?

How do I compare programming languages for projects at work?

I am wondering what are some specific questions I should keep in mind when I am comparing programming languages for use on given work projects. For instance, I am told logic programming languages like

Functional programming languages introspection

I’m sketching a design of something (machine learning of functions) that will preferably want a functional programming language, and also introspection, specifically the ability to examine the program

Flowcharting functional programming languages

Flowcharting. This ancient old practice that’s been in use for over 1000 years now, being forced upon us poor students, without any usefulness (or so do I think). It might work well with imperative, s

Which functional programming languages have bioinformatics libraries? [closed]

Which functional programming languages have bioinformatics libraries easily available? (Don’t include multi-paradigm languages such as Ruby) Update: Listing which major functional programming language

How does this javascript work, functional programming

In going through some exercises about functional programming, and returning a closure, came across this example, and cannot figure out how it works: function invoker (NAME, METHOD) { return function(t

Do concepts like Map and Reduce apply to all Functional Programming Languages?

I have just started delving into the world of functional programming. A lot of OOP (Object Oriented Programming) concepts such as inheritance and polymorphism apply to most modern OO languages like C#


Functional programing derives from lambda Calculus. If you truly want to understand Functional programing check out

It is a “fun” way to learn lambda Calculus and bring you into the exciting world of Functional programming!

How knowing Lambda Calculus is helpful in functional programming.

So Lambda Calculus is the foundation for many real-world programming languages such as Lisp, Scheme, ML, Haskell,….

Suppose we want to describe a function that adds three to any input to do so we would write:

plus3 x = succ(succ(succ x)) 

Read “plus3 is a function which, when applied to any number x, yields the successor of the successor of the successor of x”

Note that the function which adds 3 to any number need not be named plus3; the name “plus3” is just a convenient shorthand for naming this function

(plus3 x) (succ 0) ≡ ((λ x. (succ (succ (succ x)))) (succ 0))

Notice we use the lambda symbol for a function (I think it looks kind of like an Alligator I’m guessing thats where the idea for Alligator eggs came from)

The lambda symbol is the Alligator (a function) and the x is its color. You can also think of x as an argument (Lambda Calculus functions are really only suppose to have one argument) the rest you can think of it as the body of the function.

Now consider the abstraction:

g ≡ λ f. (f (f (succ 0)))

The argument f is used in a function position (in a call). We call g a higher-order function because it takes another function as an input. You can think of the other function calls f as “eggs“. Now taking the two functions or “Alligators” we have created we can do something like this:

(g plus3) = (λ f. (f (f (succ 0)))(λ x . (succ (succ (succ x)))) 
= ((λ x. (succ (succ (succ x)))((λ x. (succ (succ (succ x)))) (succ 0)))
 = ((λ x. (succ (succ (succ x)))) (succ (succ (succ (succ 0)))))
 = (succ (succ (succ (succ (succ (succ (succ 0)))))))

If you notice you can see that our λ f Alligator eats our λ x Alligator and then the λ x Alligator and dies. Then our λ x Alligator is reborn in the λ f’s Alligator eggs. Then the process repeats and the λ x Alligator on the left now eats the other λ x Alligator on the right.

Then you can use this simple set of rules of “Alligators” eating “Alligators” to design a grammar and thus Functional programming languages were born!

So you can see if you know Lambda Calculus you will understand how Functional Languages work.


main = do no <- readLn
          print (no + 1)

You can of course assign things to variables in functional languages. You just can’t change them (so basically all variables are constants in functional languages).

Functional language can save state! They usually just either encourage or force you to be explicit about doing so.

For example, check out Haskell’s State Monad.

(Some functional languages permit impure functions.)

For purely functional languages, the real world interaction is usually included as one of the function arguments, like this:

RealWorld pureScanf(RealWorld world, const char* format, ...);

Different languages have different strategies to abstract the world away from the programmer. Haskell, for instance, uses monads to hide the world argument.

But the pure part of functional language itself is already Turing complete, meaning anything doable in C is also doable in Haskell. The main difference to imperative language is instead of modifying states in place:

int compute_sum_of_squares (int min, int max) {
  int result = 0;
  for (int i = min; i < max; ++ i)
     result += i * i;  // modify "result" in place
  return result;

You incorporate the modification part into a function call, usually turning loops into recursions:

int compute_sum_of_squares (int min, int max) {
  if (min >= max)
    return 0;
    return min * min + compute_sum_of_squares(min + 1, max);

If functional programming languages cannot save any state, how do they do some simple stuff like reading input from a user (I mean how do they “store” it), or storing any data for that matter?

As you gathered, functional programming doesn’t have state—but that doesn’t mean it can’t store data. The difference is that if I write a (Haskell) statement along the lines of

let x = func value 3.14 20 "random"
in ...

I am guaranteed that the value of x is always the same in the …: nothing can possibly change it. Similarly, if I have a function f :: String -> Integer (a function taking a string and returning an integer), I can be sure that f will not modify its argument, or change any global variables, or write data to a file, and so on. As sepp2k said in a comment above, this non-mutability is really helpful for reasoning about programs: you write functions which fold, spindle, and mutilate your data, returning new copies so you can chain them together, and you can be sure that none of those function calls can do anything “harmful”. You know that x is always x, and you don’t have to worry that somebody wrote x := foo bar somewhere in between the declaration of x and its use, because that’s impossible.

Now, what if I want to read input from a user? As KennyTM said, the idea is that an impure function is a pure function that’s passed the entire world as an argument, and returns both its result and the world. Of course, you don’t want to actually do this: for one thing, it’s horribly clunky, and for another, what happens if I reuse the same world object? So this gets abstracted somehow. Haskell handles it with the IO type:

main :: IO ()
main = do str <- getLine
          let no = fst . head $ reads str :: Integer

This tells us that main is an IO action which returns nothing; executing this action is what it means to run a Haskell program. The rule is that IO types can never escape an IO action; in this context, we introduce that action using do. Thus, getLine returns an IO String, which can be thought of in two ways: first, as an action which, when run, produces a string; second, as a string that’s “tainted” by IO since it was obtained impurely. The first is more correct, but the second can be more helpful. The <- takes the String out of the IO String and stores it in str—but since we’re in an IO action, we’ll have to wrap it back up, so it can’t “escape”. The next line attempts to read an integer (reads) and grabs the first successful match (fst . head); this is all pure (no IO), so we give it a name with let no = …. We can then use both no and str in the …. We’ve thus stored impure data (from getLine into str) and pure data (let no = …).

This mechanism for working with IO is very powerful: it lets you separate the pure, algorithmic part of your program from the impure, user-interaction side, and enforce this at the type level. Your minimumSpanningTree function can’t possibly change something somewhere else in your code, or write a message to your user, and so on. It’s safe.

This is all you need to know to use IO in Haskell; if that’s all you want, you can stop here. But if you want to understand why that works, keep reading. (And note that this stuff will be specific to Haskell—other languages may choose a different implementation.)

So this probably seemed like a bit of a cheat, somehow adding impurity to pure Haskell. But it isn’t—it turns out that we can implement the IO type entirely within pure Haskell (as long as we’re given the RealWorld). The idea is this: an IO action IO type is the same as a function RealWorld -> (type, RealWorld), which takes the real world and returns both an object of type type and the modified RealWorld. We then define a couple functions so we can use this type without going insane:

return :: a -> IO a
return a = /rw -> (a,rw)

(>>=) :: IO a -> (a -> IO b) -> IO b
ioa >>= fn = /rw -> let (a,rw') = ioa rw in fn a rw'

The first one allows us to talk about IO actions which don’t do anything: return 3 is an IO action which doesn’t query the real world and just returns 3. The >>= operator, pronounced “bind”, allow us to run IO actions. It extracts the value from the IO action, passes it and the real world through the function, and returns the resulting IO action. Note that >>= enforces our rule that the results of IO actions never be allowed to escape.

We can then turn the above main into the following ordinary set of function applications:

main = getLine >>= /str -> let no = (fst . head $ reads str :: Integer) in ...

The Haskell runtime jump-starts main with the initial RealWorld, and we’re set! Everything’s pure, it just has a fancy syntax.

[Edit: As @Conal points out, this is not actually what Haskell uses to do IO. This model breaks if you add concurrency, or indeed any way for the world to change in the middle of an IO action, so it would be impossible for Haskell to use this model. It is accurate only for sequential computation. Thus, it may be that Haskell’s IO is a bit of a dodge; even if it isn’t, it’s certainly not quite this elegant. Per @Conal’s observation, see what Simon Peyton-Jones says in Tackling the Awkward Squad [pdf], section 3.1; he presents what might amount to an alternative model along these lines, but then drops it for its complexity and takes a different tack.]

Again, this explains (pretty much) how IO, and mutability in general, works in Haskell; if this is all you want to know, you can stop reading here. If you want one last dose of theory, keep reading—but remember, at this point, we’ve gone really far afield from your question!

So the one last thing: it turns out this structure—a parametric type with return and >>=— is very general; it’s called a monad, and the do notation, return, and >>= work with any of them. As you saw here, monads aren’t magical; all that’s magical is that do blocks turn into function calls. The RealWorld type is the only place we see any magic. Types like [], the list constructor, are also monads, and they have nothing to do with impure code.

You now know (almost) everything about the concept of a monad (except a few laws that must be satisfied and the formal mathematical definition), but you lack the intuition. There are a ridiculous number of monad tutorials online; I like this one, but you have options. However, this probably won’t help you; the only real way to get the intuition is via a combination of using them and reading a couple tutorials at the right time.

However, you don’t need that intuition to understand IO. Understanding monads in full generality is icing on the cake, but you can use IO right now. You could use it after I showed you the first main function. You can even treat IO code as though it was in an impure language! But remember that there’s an underlying functional representation: nobody’s cheating.

(PS: Sorry about the length. I went a little far afield.)

Plenty of good answers here, but they are long. I’m going to try to give a helpful short answer:

  • Functional languages put state in the same places that C does: in named variables and in objects allocated on the heap. The differences are that:

    • In a functional language, a “variable” gets its initial value when it comes into scope (through a function call or let-binding), and that value doesn’t change afterward. Similarly, an object allocated on the heap is immediately initialized with the values of all its fields, which don’t change thereafter.

    • “Changes of state” handled not by mutating existing variables or objects but by binding new variables or allocating new objects.

  • IO works by a trick. A side-effecting computation that produces a string is described by a function that takes a World as argument, and returns a pair containing the string and a new World. The World includes the contents of all disk drives, the history of every network packet ever sent or received, the color of each pixel on the screen, and stuff like that. The key to the trick is that access to the World is carefully restricted so that

    • No program can make a copy of the World (where would you put it?)

    • No program can throw away the World

    Using this trick makes it possible for there to be one, unique World whose state evolves over time. The language run-time system, which is not written in a functional language, implement a side-effecting computation by updating the unique World in place instead of returning a new one.

    This trick is beautifully explained by Simon Peyton Jones and Phil Wadler in their landmark paper “Imperative Functional Programming”.

The technique for handling state in Haskell is very straightforward. And you don’t need to understand monads to get a handle on it.

In a programming language with state, you typically have some value stored somewhere, some code executes, and then you have a new value stored. In imperative languages this state is just somewhere “in the background”. In a (pure) functional language you make this explicit, so you explicitly write the function that transforms the state.

So instead of having some state of type X, you write functions that map X to X. That’s it! You switch from thinking about state to thinking about what operations you want to perform on the state. You can then chain these functions together and combine them together in various ways to make entire programs. Of course you’re not limited to just mapping X to X. You can write functions to take various combinations of data as input and return various combinations at the end.

Monads are one tool, among many, to help organise this. But monads aren’t actually the solution to the problem. The solution is to think about state transformations instead of state.

This also works with I/O. In effect what happens is this: instead of getting input from the user with some direct equivalent of scanf, and storing it somewhere, you instead write a function to say what you’d do with the result of scanf if you had it, and then pass that function to the I/O API. That’s exactly what >>= does when you use the IO monad in Haskell. So you never need to store the result of any I/O anywhere – you just need to write code that says how you’d like to transform it.

I’m breaking off a comment reply to a new answer, to give more space:

I wrote:

As far as I can tell, this IO story (World -> (a,World)) is a myth when applied to Haskell, as that model explains only purely sequential computation, while Haskell’s IO type includes concurrency. By “purely sequential”, I mean that not even the world (universe) is allowed to change between the start and end of an imperative computation, other than due to that computation. For instance, while your computer is chugging away, your brain etc cannot. Concurrency can be handled by something more like World -> PowerSet [(a,World)], which allows for nondeterminism and interleaving.

Norman wrote:

@Conal: I think the IO story generalizes pretty nicely to nondeterminism and interleaving; if I remember right, there’s a pretty good explanation in the “Awkward Squad” paper. But I don’t know of a good paper that explains true parallelism clearly.

@Norman: Generalizes in what sense? I’m suggesting that the denotational model/explanation usually given, World -> (a,World), doesn’t match Haskell IO because it doesn’t account for nondeterminism and concurrency. There might be a more complex model that does fit, such as World -> PowerSet [(a,World)], but I don’t know whether such a model has been worked out and shown adequate & consistent. I personally doubt such a beast can be found, given that IO is populated by thousands of FFI-imported imperative API calls. And as such, IO is fulfilling its purpose:

Open problem: the IO monad has become Haskell’s sin- bin. (Whenever we don’t understand something, we toss it in the IO monad.)

(From Simon PJ’s POPL talk Wearing the hair shirt Wearing the hair shirt: a retrospective on Haskell.)

In Section 3.1 of Tackling the Awkward Squad, Simon points what doesn’t work about type IO a = World -> (a, World), including “The approach does not scale well when we add concurrency”. He then suggests a possible alternative model, and then abandons the attempt at denotational explanations, saying

However we will instead adopt an operational semantics, based on standard approaches to the semantics of process calculi.

This failure to find a precise & useful denotational model is at the root of why I see Haskell IO as a departure from the spirit and the deep benefits of what we call “functional programming”, or what Peter Landin more specifically named “denotative programming”. See comments here.