Spoilers Warning

Intro 

The Marvel Cinematic Universe (MCU) has grown to the point where it is the most successful movie franchise ever. Stories, legends, myths, and fictions form a key part of the human experience, and stories with this scale and reach are almost unprecedented. So digging into the mathematical structure of the collaborative universe is something I plan to do for a while. In this post we examine the structure of the MCU timeline (see above).

I talked some more about the Marvel Cinematic Universe (MCU) and why it’s interesting in a previous post. Have a read if you want to get started on this. Today I want to talk about a way of visualising the relationships between the characters and the movies in a timeline (shown above). It’s just a starting point – I plan a few follow-up posts. At the least, it’s a way you can track your favourite character, and plan a movie marathon to watch all of their movies in order.

I’ve been working on this for a while, so it’s possible you saw a previous version. My aim was to get this out on the release of Captain Marvel, and I thought I had it down. Then I saw the movie, which is brilliant, but required a few last minute alterations to the timeline, so slight delay here to get the final version up.

Visualising the MCU 

How do we visualise something as complex as the MCU? Well, there are a few other folks who’ve had a go:

These are classic network analyses. A network (maths peeps sometimes call it a graph) shows the relationships between a set of entities. These analyses are aiming to establish relationships between characters, or between movies. From those we can work out central characters, and so on. But I think you already know the central characters, more or less. Frankly, you can just count the number of movies each character is in.

Network analyses are great, and these ones have classier graphics than I do. But I want to show a little more than just who’s who. Time is a very important element of any narrative, but is sometimes overlooked in network views of the data. So, inspired by the map of Napoleon’s Russian campaign by Minard (Tufte called Minard’s figure “one of the best statistical graphics ever” in Beautiful Evidence),

Minard's graphic of Napoleon's losses in the Russian campaign.

and the “Movie Narrative Charts” of XKCD,

https://xkcd.com/657/

I had a go at a timeline/flow diagram/narrative chart of the MCU (see above for the picture, and below for some details).

My Timeline of the MCU 

There’s a lot I want to say about this timeline. First, some footnotes: 1, 2, 3, 4, 5, 6, 7. Go and read them before you get upset about mistakes. I’m not saying there aren’t any mistakes, just that some things that look like mistakes are deliberate choices.

There are plenty of people doing timelines for the MCU, ranging from Marvel’s own coarse-grained timeline, to amazingly detailed fan sites. Some of these are almost at the level of telling us what Iron Man (Tony Stark) had for breakfast today. Here are links to quite a few:

However, despite the title, I am not just doing another timeline. In XKCD terms it’s a narrative chart and I think of it as a flow diagram, and there are other names as well (see links at the end of this), but story timeline makes sense here. If we define a story as a sequence of events, and a narrative as how it is told, then I am trying to get at the story not the narrative. The pieces here are drawn in the order they happen (to my best ability) not the order they are shown (for instance the second Guardians movie appeared out of order compared to when it is set). So a (story) timeline seems the right term.

The goal of my timeline is to track each of the heroes through the movies they participate in. So we can see each character’s personal timeline. Each coloured line shows the path of one character (or sometimes a small group).

Beyond this the chart has a few other objectives:

  • Horizontal location is indicative of the time at which the movie was set (see footnote 2), though I had to take liberties in a few places to make it all fit. I tried to make my timeline consistent with as many of the timelines above as I could, but they disagree in places. Where possible I give precedence to Marvel’s official timeline, and for detailed placement take advice from Collider, which has lots of detail, and is at least internally consistent. I suspect that I should move Infinity War into 2018, to be consistent with Collider, but for the moment let’s stay with the official Marvel version where possible.

  • Sequences (direct sequels) for a particular hero are horizontally aligned, and non-sequences avoid alignment (as much as possible).

  • The field is vertically separated into human tech on the bottom, and magic, gods and space aliens on the top, though this is a coarse discriminator (the Tesseract in the 1st Captain America might seem to place it with the latter, but this doesn’t fit with the other placement criteria).

  • Node colours indicate the phase of the MCU. The phases are disjoint from the point of view of release dates of movies, but not so in terms of the timeline.

Ideally, charts like this are laid out by a clever computer program, but the programmer in this case was only just clever enough to do it manually. There are tools to draw these (see links below), but graph visualisation with as many constraints as I added (I’m also trying to reduce the number of edge crossings) isn’t easy at all. Set it up with so many goals and you almost certainly end up with an NP-hard problem, for which finding a provably optimal layout is likely infeasible in any reasonable amount of time.
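To give a feel for just one of these competing goals, here is a small Python sketch that counts edge crossings between two adjacent columns of a layout. It is only an illustration, not the procedure used for the chart (which was laid out by hand): the function name `count_crossings` and the character positions are made up for the example. The idea is that two lines cross when their vertical order swaps between the columns.

```python
from itertools import combinations


def count_crossings(left, right):
    """Count crossings between two adjacent columns of a layout.

    `left` and `right` map each character to a vertical position in the
    two columns. Two character lines cross if their order is reversed.
    """
    shared = [c for c in left if c in right]
    crossings = 0
    for a, b in combinations(shared, 2):
        if (left[a] - left[b]) * (right[a] - right[b]) < 0:
            crossings += 1
    return crossings


# Hypothetical vertical positions, purely for illustration.
before = {"Iron Man": 0, "Thor": 1, "Hulk": 2}
swapped_pair = {"Iron Man": 0, "Thor": 2, "Hulk": 1}
fully_reversed = {"Iron Man": 2, "Thor": 1, "Hulk": 0}

print(count_crossings(before, swapped_pair))
print(count_crossings(before, fully_reversed))
```

Counting crossings for one candidate layout is cheap. The hard part is searching over all the possible orderings while also respecting the alignment and timing goals.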

I do have some code/data that goes into this. This CSV file gives the information I compiled (by hand) from the existing timelines, and the table from my last post. The data is also shown below and you can get the CSV file here.

| Code | Phase | Release year | Setting year | Setting month | Title |
|------|-------|--------------|--------------|---------------|-------|
| a | 1 | 2008 | 2010 | October | Iron Man |
| b | 1 | 2010 | 2011 | May | Iron Man 2 |
| c | 1 | 2008 | 2011 | May | The Incredible Hulk |
| d | 1 | 2011 | 2011 | May | Thor |
| e | 1 | 2011 | 2012 | April | Captain America: The First Avenger |
| f | 1 | 2012 | 2012 | May | The Avengers |
| g | 2 | 2013 | 2012 | December | Iron Man 3 |
| h | 2 | 2013 | 2013 | November | Thor: The Dark World |
| i | 2 | 2014 | 2014 | April | Captain America: The Winter Soldier |
| j | 2 | 2014 | 2014 | August | Guardians of the Galaxy |
| k | 3 | 2017 | 2014 | October | Guardians of the Galaxy Vol. 2 |
| l | 2 | 2015 | 2015 | May | Avengers: Age of Ultron |
| m | 2 | 2015 | 2015 | July | Ant-Man |
| n | 3 | 2016 | 2016 | June | Captain America: Civil War |
| o | 3 | 2018 | 2016 | June | Black Panther |
| p | 3 | 2017 | 2016 | September | Spider-Man: Homecoming |
| q | 3 | 2016 | 2016 | May | Doctor Strange |
| r | 3 | 2017 | 2017 | June | Thor: Ragnarok |
| s | 3 | 2018 | 2017 | July | Avengers: Infinity War |
| t | 3 | 2018 | 2017 | July | Ant-Man and the Wasp |
| u | 4 | 2019 | 2018 | | Captain Marvel |
| v | 4 | 2019 | 2018 | | Avengers: Endgame |
| w | 4 | 2019 | 2018 | | Spider-Man: Far From Home |
| A | 4 | 2019 | | | Guardians of the Galaxy Vol. 3 |
| B | 4 | 2019 | | | Black Panther 2 |
| C | 4 | 2019 | | | Eternals |
| D | 4 | 2020 | | | Black Widow |
| E | 4 | 2020 | | | Captain Marvel 2 |

The “setting” year and month refer to my best estimate of where the movie should be placed in the timeline, given all of the caveats mentioned above, and the footnotes below.
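As a rough illustration of how this data separates story order from release order, here is a minimal Python sketch. It is not the code behind this post: the comma-separated layout, the header names and the helper functions are all assumptions, and only a handful of rows are included (rows with no setting year would need extra handling).

```python
import csv
import io

# A handful of rows copied from the data above. The header names and the
# comma-separated layout are assumptions, not the real format of the file.
SAMPLE_CSV = """code,phase,release_year,setting_year,setting_month,title
l,2,2015,2015,May,Avengers: Age of Ultron
a,1,2008,2010,October,Iron Man
k,3,2017,2014,October,Guardians of the Galaxy Vol. 2
u,4,2019,2018,,Captain Marvel
j,2,2014,2014,August,Guardians of the Galaxy
"""

MONTHS = ["January", "February", "March", "April", "May", "June", "July",
          "August", "September", "October", "November", "December"]


def load_movies(text):
    """Parse the CSV text into a list of dicts, one per movie."""
    return list(csv.DictReader(io.StringIO(text)))


def setting_key(movie):
    """Sort key for story order: setting year, then setting month.

    A missing month sorts as 0, i.e. before any named month in that year.
    """
    month = movie["setting_month"]
    month_number = MONTHS.index(month) + 1 if month else 0
    return (int(movie["setting_year"]), month_number)


def story_order(movies):
    """Titles in the order the stories are set."""
    return [m["title"] for m in sorted(movies, key=setting_key)]


def release_order(movies):
    """Titles in order of release year (ties keep their input order)."""
    return [m["title"] for m in sorted(movies, key=lambda m: int(m["release_year"]))]


movies = load_movies(SAMPLE_CSV)
print(story_order(movies))
print(release_order(movies))
```

With these rows the two orderings differ only in where Guardians of the Galaxy Vol. 2 falls, which is the out-of-order case mentioned earlier.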

Building the Picture 

The timeline (the picture above) was drawn manually using Inkscape, which is a brilliant tool for vector graphics. In theory I could automate drawing of these and I definitely plan to have a go at it, but let’s do that later.

In the last post I explained how to download data via Web APIs. I used the same approach, but in addition used IMDbPY to download cast data from IMDb (which doesn’t have such a nice API). The raw cast data has some problems: apart from a couple of spelling mistakes, there is no consistency in how the heroes are named. At the simplest level this is because heroes have alter egos, but even then, there are multiple variants used. To keep track of these I have a little file full of aliases. Here are the first few entries:

| Character | Aliases |
|-----------|---------|
| Aaron Davis | Aaron Davis,Prowler |
| Abomination | Abomination, The Abomination, Emil Blonsky, Blonsky |
| Abu Bakaar | Abu Bakaar |
| Agent 13 | Agent 13, Kate / Agent 13, Sharon Carter |
| Agent Garrett | Agent Garrett, Jonathan 'John' Garrett |
| Agent Scott Kelly | Agent Scott Kelly |
| Agent Stoltz | Agent Stoltz,Stoltz |

The list doesn’t include all named characters (yet), but has most that appear in more than one movie.

I use these aliases (via a Julia dictionary – see my last post) to map the cast names in IMDb to a consistent set of names. I mostly use the standard “superhero” name. This has limits (in the future it is possible that more than one person will take the role of, for instance, Captain America), but it is acceptable for the current MCU.
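My own lookup is a Julia dictionary; the sketch below shows the same idea in Python, purely as an illustration. The function names, the case-insensitive matching and the choice to return `None` for unknown names are assumptions, and only a few of the alias entries listed above are included.

```python
# Alias entries copied from the listing above: canonical name -> raw alias string.
ALIAS_ROWS = {
    "Aaron Davis": "Aaron Davis,Prowler",
    "Abomination": "Abomination, The Abomination, Emil Blonsky, Blonsky",
    "Agent 13": "Agent 13, Kate / Agent 13, Sharon Carter",
    "Agent Stoltz": "Agent Stoltz,Stoltz",
}


def build_alias_lookup(alias_rows):
    """Invert the alias table into a dict: lower-cased alias -> canonical name."""
    lookup = {}
    for canonical, aliases in alias_rows.items():
        for alias in aliases.split(","):
            # Strip whitespace because the raw entries are inconsistent
            # about spaces after commas.
            lookup[alias.strip().lower()] = canonical
    return lookup


def canonical_name(cast_name, lookup):
    """Return the canonical character name, or None if no alias matches."""
    return lookup.get(cast_name.strip().lower())


lookup = build_alias_lookup(ALIAS_ROWS)
for raw in ["Emil Blonsky", "Kate / Agent 13", "Prowler", "Some New Character"]:
    print(raw, "->", canonical_name(raw, lookup))
```

Returning `None` for an unmatched name, rather than passing the raw name through, makes it easy to list the cast names that the alias file does not yet cover.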

There is one problem with this. You always need to spend time cleaning your data. In this case I went through it a few times, checking details. Most of the problems occurred because my alias list wasn’t complete, and these were easy to fix. But there are a few weirdnesses:

  • Black Widow appears in Ragnarok but is not listed in IMDb. Presumably that is because she appears only in a recorded video, not live; however, the similar appearance of Captain America in Homecoming is listed.

  • They ret-conned Iron Man 2, saying that the kid in the Iron-Man mask is a younger Peter Parker, and again this does not appear in IMDb.

  • Phil Coulson is implicitly in The Incredible Hulk through the Marvel one-shot movie The Consultant, which is essentially an extension of the end credits of The Incredible Hulk.

  • There is, in my list, an alias connecting JARVIS (Stark’s AI) to Vision because JARVIS becomes Vision. But in the final timeline, I chose to place Vision’s origin in Avengers: Age of Ultron, where it fits in the story.

The current code fixes these pieces semi-manually, and unfortunately, when we are dealing with real data, often there are exceptions that need a semi-manual fix like this.

After all this, we have a list of movies that each character participates in, and from these, we use the sequence listed above, or here to put the movies in sequence for each character, and then we have a path for each. The first few are listed below, and the file is here.

| Character | Titles | Codes |
|-----------|--------|-------|
| Abomination | The Incredible Hulk | c |
| Agent 13 | Captain America: The Winter Soldier, Captain America: Civil War | in |
| Aldrich Killian | Iron Man 3 | g |
| Alexander Pierce | Captain America: The Winter Soldier | i |
| Ant-Man | Ant-Man, Captain America: Civil War, Ant-Man and the Wasp, Avengers: Endgame | mntv |
| Betty Ross | The Incredible Hulk | c |
| Black Panther | Captain America: Civil War, Black Panther, Avengers: Infinity War, Avengers: Endgame | nosv |

The “codes” column is a mapping of the movies to the codes I gave in the sequence listing of the movies. It’s a key to allow sorting of the movies in sequential order, but the strings formed by listing the codes of the movies in the path will be useful to us when we start to learn more about these paths in future posts.
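Here is a minimal sketch of how such a path and its code string could be assembled, assuming we already have the set of movies for each character. It is Python for illustration rather than the code used for this post: `TITLE_TO_CODE` is a subset of the movie listing above, and `CODE_ORDER` and `path_codes` are hypothetical names.

```python
import string

# Story order of the codes: lowercase a..w first, then uppercase A..E.
# A plain sorted() would put uppercase letters first, so the order is explicit.
CODE_ORDER = string.ascii_lowercase[:23] + "ABCDE"

# Subset of the movie listing: title -> code.
TITLE_TO_CODE = {
    "Captain America: The Winter Soldier": "i",
    "Ant-Man": "m",
    "Captain America: Civil War": "n",
    "Black Panther": "o",
    "Avengers: Infinity War": "s",
    "Ant-Man and the Wasp": "t",
    "Avengers: Endgame": "v",
}

# Unordered sets of movies per character, as produced by the cast matching.
APPEARANCES = {
    "Agent 13": {"Captain America: Civil War",
                 "Captain America: The Winter Soldier"},
    "Ant-Man": {"Avengers: Endgame", "Ant-Man", "Ant-Man and the Wasp",
                "Captain America: Civil War"},
    "Black Panther": {"Avengers: Endgame", "Black Panther",
                      "Captain America: Civil War", "Avengers: Infinity War"},
}


def path_codes(titles):
    """Return the movie codes for `titles` as a string in story order."""
    codes = [TITLE_TO_CODE[title] for title in titles]
    return "".join(sorted(codes, key=CODE_ORDER.index))


for character, titles in APPEARANCES.items():
    print(character, path_codes(titles))
```

Because each movie is a single character, a path becomes a short string, so standard string operations (prefixes, common subsequences and so on) can be applied to it directly.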

Note that the file has paths for many of the characters in the film, but I only included a selection of the most important heroes 7 in the timeline.

There is one hiccup still. The appearance of Captain America and The Winter Soldier in Ant-Man is in an end-credit scene. That is OK, and happens often in other movies. One of the cutenesses of the MCU is their clever use of these to lure fans forward. But that end-credit scene is (uniquely) actually a cut from the next Captain America film Civil War. Hence it is the only place in the timeline where we get a non-simple path, i.e., a path with a loop. Once we add that loop, the timeline is finished (except for endless fiddling to get it to look OK).
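In graph terms, a path is simple if it never visits the same node twice. The check below is a small Python illustration of that definition. The example sequences are assumptions made for the sake of the example: the first is Ant-Man's path from the listing above, while the second is a hypothetical encoding of a loop in which a line passes through Civil War (n), detours to the Ant-Man end-credit scene (m), and returns to Civil War. It is not a statement of how the loop is stored in my data.

```python
def is_simple(path):
    """True if no movie code is visited more than once along the path."""
    return len(set(path)) == len(path)


def repeated_nodes(path):
    """Codes that occur more than once, in the order they first repeat."""
    seen, repeats = set(), []
    for code in path:
        if code in seen and code not in repeats:
            repeats.append(code)
        seen.add(code)
    return repeats


print(is_simple("mntv"))       # Ant-Man's path from the listing above
print(is_simple("lnmn"))       # hypothetical looped path (assumption)
print(repeated_nodes("lnmn"))  # which movie the loop returns to
```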

I know I’ve glossed over details here, in order to make this post a reasonable length, but I will go through more technical detail in a subsequent post. Stay tuned.

Summary 

I want to keep these posts to a reasonable length, and this one was already over my target, so this is really just a description of the timeline. I have some ideas of what to do with it though, and you will see them in future posts.

In the meantime, enjoy the timeline. If it gets popular, I aim to post a link to the raw SVG file, so people can mash with it. So do the usual social media stuff, if you care.


Links 

Other narrative charts

Generating charts

Other similar concepts

  • Sankey diagrams

  • Activity timelines

  • Event charts

  • Transaction charts


Footnotes:

  1. The timeline is based only on the movies (with a small diversion for *Agents of Shield*, and taking into account the one shots). If the various graphic novels were taken into account there would be additions (for instance, *Fury's Big Week* links Nick Fury and Black Widow into *The Incredible Hulk*).
  2. The locations are intended to help place the movies in time. However, many of the movies cover multiple times: *Doctor Strange* takes place over approximately a 1 year period, and *Captain America: The First Avenger* takes place over some time during WWII. The choice of location in such cases is somewhat arbitrary. Additionally many of the movies have flashbacks, and some have flash forwards. The positions in the picture are chosen to help show the sequence of movies so we base the timing of each movie on its context within that sequence. That results in some (perhaps) unexpected locations. *Captain America* has only short sequences set in 2012, but it is these we use to place the movie, though we do make reference to the earlier component. Likewise Captain Marvel's timing is based on the only contemporary sequence, which is only a very brief part of the movie.
  3. We could argue about what it means to "be in" one of the movies. Some appearances are only in end credits (Nick Fury in *Iron Man*, or the Winter Soldier in *Black Panther*), or via a taped video (Captain America in *Spider-Man*, or Black Widow in *Ragnarok*), and in some cases these are not even credited. Our approach is to include an appearance if the character speaks, but not if it is an indirect reference (e.g., we exclude the reference to Spider-Man in *Ant-Man*). Other unusual instances: Bruce Banner is narrated to in *Iron Man 3*, and Peter Parker is ret-conned to be the kid in the Iron Man mask in *Iron Man 2*. We also include the one-shot movie "The Consultant" as an extended end-credit of The Incredible Hulk (it overlaps with the actual end-credit), and it places Coulson in that movie's timeline.
  4. Villains are not shown (except for those who change sides), but most major villains appear in only one movie and so would not appear in any case (Thanos is an obvious exception). This is a departure from the graphic novel genre where villains often reappear.
  5. Most of the appearances create a well-ordered set of movies, i.e., we don't see something in one movie that overlaps a later movie. It's a little dicey during *Fury's Big Week* (*Iron Man 2*, *The Incredible Hulk* and *Thor* all happen almost in parallel), but the only place where we can't maintain a sequence at all is the preview of the *Winter Soldier* (and Cap and Falcon) in the end credits of *Ant-Man*. So in this one instance there is a loop in the paths. All others are "simple" paths.
  6. A superhero's hero-alter-ego might not exist in their early appearances (e.g., Bucky Barnes in *Captain America*). Normally, we ignore this and consider them to be largely the same person, plus or minus superhero powers. However, there is an exception: Stark's AI JARVIS who evolves into Vision. At least naively, JARVIS seems qualitatively different from Vision, so his origin is placed in *Age of Ultron*.
  7. Why these characters in particular? The goal was to track the superheroes (excluding villains for the most part). And by its nature it only sees characters that are in at least 2 movies. But there are borderline cases. Why not your favourite love interest -- Pepper Potts? Honestly I don't have a good reason except that it was starting to get just too complicated. I'll keep trying to integrate as many as I can.