Describing events in code
In my previous post I wrote about cataloguing 28 years’ worth of my visits to see movies, plays, gigs, exhibitions, etc. In this post I’m writing about some of the code structure, the decisions I made, and the changes I made as I went along. It might be useful or interesting to someone else, or to me to remember why some things are how they are. Maybe the thought processes are interesting to people who don’t write code themselves.
The code is an “app” (a modular chunk of code) for the Django framework, called django-spectator. It’s in two parts: one for recording what books and periodicals I’ve read (used here) and one for recording events I’ve been to (used here).
The reading part was built to replicate some old PHP code I’d been using for years so I had a clear idea of how it should work. The events part was new and I ended up changing my mind about its structure while writing it. And then, once it was “finished” and I started using it, I realised there were still more things I hadn’t done well and needed changing. I hope no one else has been using the code during this time.
The code itself isn’t terribly advanced but I still find some of this interesting — the compromises and balances required in representing real-world objects and events in code, in a way that’s useful and, to the end user (me), usable. The balance would be different in many cases if this code was for a large commercial site, and/or for different users, instead of only being for my personal website.
Rather than litter this write-up with code samples, I’ll link to the code on GitHub, using the current commit at the time of writing, for anyone who wants more detail.
§ The basics
There are some things that remained fairly consistent throughout my changes, so I’ll describe them first. In the Django code these are all models, each one representing a database table. Each model represents a “thing” in the world, often an object, but maybe a single concept (like Event and EventRole, below).
Creators
The Creator model represents an individual or a group. e.g. “William Shakespeare” or “Belle and Sebastian”. Someone/people who wrote something, performed in something, directed something, etc. (On GitHub.) Each Creator has a kind field that indicates if it’s an individual or a group.
(A “field” represents a piece of information about a particular thing. e.g. Creators also have a field for name. Each field is a column in a database table, like a column in a spreadsheet.)
If we were getting complicated we could have People and Groups as separate models, with a Group containing zero or more People. This might be useful if you went to see the group Ben Folds Five one year and then the next saw Ben Folds on his own — you might want to represent that they are linked, or that you had now seen Ben Folds twice, once as part of a group. Or you might want to record every member of a group you’d seen.
But this is more complex than is useful to me and so I conflated the two, which is good enough almost all the time. There are some odd things… Is “Nick Cave & The Bad Seeds” a group, or an individual and a group? I probably change my mind over this kind of thing from one day to the next, but it’s not a big deal.
Creators are shared between both the reading and events parts of the code. So I could see David Byrne in concert and read a book by him, and they’re both linked to the same Creator object.
Creators have a single name field, which avoids the problems with using separate firstname and surname fields and means it can also be used if the Creator is a group. Because we might want to sort Creators-who-are-people by surname there’s also a separate field (on GitHub) that stores an automatically-generated “surname, firstname” style string in a manner that is good-enough-for-me but would, I imagine, contain a nightmare of edge cases in a real, international, application.
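As a rough sketch of that idea, here is how a single name plus an automatically generated sort string might look. It uses a plain Python dataclass rather than a real Django model so that it runs on its own, and the `name_sort` property name and the "last word is the surname" rule are assumptions for illustration, not necessarily what django-spectator does.

```python
from dataclasses import dataclass


@dataclass
class Creator:
    name: str
    kind: str = "individual"  # or "group"

    @property
    def name_sort(self) -> str:
        """Naive sort string: 'Surname, Firstnames' for individuals.

        Groups, and individuals with a single-word name, are returned
        unchanged. Hypothetical rule: the last word is the surname.
        """
        if self.kind != "individual":
            return self.name
        parts = self.name.split()
        if len(parts) < 2:
            return self.name
        return f"{parts[-1]}, {' '.join(parts[:-1])}"


print(Creator("William Shakespeare").name_sort)
print(Creator("Belle and Sebastian", kind="group").name_sort)
```

Even this tiny version shows where the edge cases come from: anyone whose surname is more than one word, or comes first, will be sorted wrongly.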
Venues
A Venue is a place where an Event happened. A cinema, theatre, ferry, a street, etc. (On GitHub.)
Venues have a country field (because I wanted to see how many countries I’d been to events in), a latitude and a longitude (because I want to put these on maps), and an address (which I lazily populate with some barely-useful information using a Google API and code similar to this). (On GitHub.)
Pretty simple. Too simple, as it turns out below.
Events
The Event model records a visit to see something on a specific date at, optionally, a specific Venue. (On GitHub.) A Venue is optional for an Event because sometimes I knew when I’d been to something but I had no record or memory of where it was.
I can imagine adding an optional time field in future, if only so that multiple Events on the same day can be displayed in the correct order.
An Event can be one of several kinds: Cinema, Classical Concert, Comedy, Dance, Exhibition, Gig, Theatre or Other. This isn’t very flexibly done and, having input all my data, I now want to add “Talk”. Other people will have other requirements. Still, this lets us display particular kinds of Event grouped together, which feels important.
§ Events vs Works
That’s all good but we’ve yet to link Creators to Events. I realised there are two different ways this can be done.
The simplest is for zero or more Creators to be linked directly to an Event. For example, if you go to see The Mountain Goats play live, you’re actually seeing The Mountain Goats right there, at the event. That’s easy. We can link their Creator object to this particular Event object.
The only addition to this is that rather than linking Creators to Events directly, I’m using a “through model” (EventRole) that describes the link itself: how it should be described and what order it should be displayed in (on GitHub). So we can say that at this particular Event, The Mountain Goats were headlining, and so should be listed first, while Emmy the Great should be described as “Support” and appear down the bill.
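To make the “through model” idea concrete, here is a minimal sketch in plain Python dataclasses (not the actual Django models, where this would be a many-to-many relationship with a through model). The field names `role_name` and `role_order` and the `billing()` helper are assumptions made for this example.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class Creator:
    name: str


@dataclass
class EventRole:
    """The link between one Creator and one Event."""
    creator: Creator
    role_name: str = ""   # e.g. "Support"; empty means no description
    role_order: int = 1   # lower numbers are listed first


@dataclass
class Event:
    title: str
    roles: list[EventRole] = field(default_factory=list)

    def billing(self) -> list[str]:
        """Creators in display order, with the role in brackets if set."""
        ordered = sorted(self.roles, key=lambda role: role.role_order)
        return [
            f"{r.creator.name} ({r.role_name})" if r.role_name else r.creator.name
            for r in ordered
        ]


gig = Event("A gig")
gig.roles.append(EventRole(Creator("Emmy the Great"), "Support", role_order=2))
gig.roles.append(EventRole(Creator("The Mountain Goats"), role_order=1))
print(gig.billing())
```

The order and description live on the link rather than on the Creator, so the same Creator can headline one Event and be the support act at another.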
Movies
However… if you go to see the movie Dunkirk you didn’t actually see the director Christopher Nolan or any of the actors, all of whom you might want to record in the event’s data, depending on your diligence/masochism. You could call the Event itself “Dunkirk” and add Christopher Nolan to it, as we did with The Mountain Goats. But what if you go to see Dunkirk again? You could do the same with a second Event but there’s no nice way of grouping both visits together other than by finding both Events with name of “Dunkirk”. Which is OK, so long as you never see a play called “Dunkirk”, or that 1958 film of the same name.
So it seems we need a separate Movie model. We could link Creators (like Christopher Nolan) to this. And then link a Movie to an Event. So when you create a new Event object you can say “it featured this Dunkirk Movie”. We could then see all the Events connected to that one Movie. And we could list all the Christopher Nolan Movies we’ve seen.
We’ll also add a MovieRole “through model”, like we did with Events, to indicate that, in this case, Nolan was the “Director” and should be listed ahead of anyone else. Very good.
Plays
Now, what about plays? Maybe we could do similar here and have a Play model, with Creators linked directly to that. This doesn’t quite work though. If you go to see King Lear you’d make a new Play object and assign William Shakespeare to it as “Playwright”. And this was a production by the Royal Shakespeare Company so you could add them as a Creator too. And then credit Antony Sher as playing Lear. It’s all looking good…
…until a few years later and you go to see King Lear performed by a local school, so you create a new Event, and add your “King Lear” Play to it… and realise that while it correctly lists Shakespeare as the playwright, it also lists the RSC and Antony Sher, neither of whom were at this performance in a school hall. Ah.
So maybe the Performance of a Play is a separate thing? The Event would have a Performance linked to it (with the RSC and Sher attached) and this would have the “King Lear” Play linked to it (with Shakespeare attached).
A chain of Event > Performance > Play. This seems to work.
But then… what if you thought Sher was so good that you went to see the RSC’s production several times? It doesn’t seem quite accurate enough to list all these Performances separately with no real relationship between them. So maybe we need a Production! A Production could have several Performances of the same Play. The Production would have the RSC attached while the Performances would have Sher attached, or his understudy if he was ill.
We’d have Event > Performance > Production > Play.
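Sketched as plain Python dataclasses (a hypothetical illustration of this chain, not code from django-spectator), with each level carrying only the credits that belong to it:

```python
from __future__ import annotations

from dataclasses import dataclass, field

Role = tuple[str, str]  # (role name, creator name)


@dataclass
class Play:
    title: str
    roles: list[Role] = field(default_factory=list)


@dataclass
class Production:
    play: Play
    roles: list[Role] = field(default_factory=list)


@dataclass
class Performance:
    production: Production
    roles: list[Role] = field(default_factory=list)


@dataclass
class Event:
    label: str
    performance: Performance


def credits(event: Event) -> list[Role]:
    """Collect the roles from the Play, Production and Performance."""
    performance = event.performance
    production = performance.production
    return production.play.roles + production.roles + performance.roles


lear = Play("King Lear", [("Playwright", "William Shakespeare")])
rsc = Production(lear, [("Company", "Royal Shakespeare Company")])

# Two visits to the same Production, with a different Lear each night.
night_one = Event("first visit", Performance(rsc, [("Lear", "Antony Sher")]))
night_two = Event("second visit", Performance(rsc, [("Lear", "An understudy")]))

# The school version shares the Play but nothing else.
school = Event("school hall", Performance(Production(lear)))

print(credits(night_one))
print(credits(school))
```

Four linked objects for one trip to the theatre is where the data entry starts to feel like hard work.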
This seems reasonably accurate. It could be “better” though… what if you saw a version of King Lear that was adapted by someone else? Maybe you saw it translated into another language. So we’d need to have a way of having “versions” of the same Play, each with additional roles (e.g. “Translator”).
So this could be improved, and made more detailed, but… No, stop! It’s already too much you fool!
If this was a larger website, with lots more data, maybe this would be required. But for my purposes it’s too complicated, and it gets too fiddly for inputting the data. I did start off with the Event > Performance > Play structure but that was too much for my needs and so I simplified things.
I went back to the start of this process and ended up with a Play (King Lear by William Shakespeare) which can be linked to an Event (that features the RSC and, if I entered him, Antony Sher).
It’s not perfect, but it works OK for my needs. A compromise in favour of simplicity and ease-of-use.
Classical works and dance pieces
I don’t see many of these but I think I started off with a similar structure to Plays, having performances of particular works (like Music for 18 Musicians). While this would be required for some websites (or still be too simple) it was overkill for me. So I simplified it to having a ClassicalWork (e.g. with “Composer” Steve Reich) linked directly to an Event (with the performers attached to that). And similar for DancePieces.
More simplifying
After simplifying plays a bit, this is where I was: An Event could have one or more of these things attached to it, depending on whether it had a kind of “Cinema”, “Theatre”, “Classical Concert” or “Dance”:
- Movie
- Play
- ClassicalWork
- DancePiece
Each of those has its own “through model” linking Creators to it: MovieRole, PlayRole, ClassicalWorkRole, DancePieceRole.
This worked fine, for my needs. But I realised it was still unnecessarily restrictive and fiddly for two reasons.
First, what if I went to an Event that featured a performance of a classical work and then a piece of dance? In my system an Event of a particular kind could only have the matching type of work added to it. It seemed logical at first but it was an unnecessary restriction. So I removed it. Now, the Event kind is more of a guide as to how we could split Events up when displaying them. We can list “Cinema” Events separately from “Theatre” Events. And a “Cinema” Event could feature a DancePiece as well as a Movie. That’s fine. No one will die.
Second, having simplified how I represented plays, classical works and dance pieces (removing the Performance model between them and Events) they were all pretty similar to each other, and to Movies. It now seemed overly-complicated in the website’s admin pages to list forms for these different models separately. So I eventually ended up conflating them all into a single Work model, which has its own kind field (“Classical work”, “Dance piece”, “Movie” or “Play”). And a single WorkRole “through model” associating Creators to it. (On GitHub.)
This is much simpler. We have an Event which can optionally have Creators associated directly with it: people or groups who were actually there (e.g. musicians at a gig, actors at a play, film directors at an after-movie-discussion, etc.). And then the Event can optionally have a variety of Works linked to it, each of which can optionally have Creators associated with them (e.g. playwright, director, etc.).
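The simplified structure might be sketched like this, again as plain Python dataclasses standing in for the Django models. The `Role` class plays the part of both EventRole and WorkRole here, and the `events_featuring()` helper and the event titles are invented for the example.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class Role:
    """Stands in for both EventRole and WorkRole in this sketch."""
    creator_name: str
    role_name: str = ""


@dataclass
class Work:
    title: str
    kind: str  # "Classical work", "Dance piece", "Movie" or "Play"
    roles: list[Role] = field(default_factory=list)


@dataclass
class Event:
    title: str
    kind: str  # only a guide for grouping Events when displaying them
    roles: list[Role] = field(default_factory=list)   # who was actually there
    works: list[Work] = field(default_factory=list)   # what was shown or performed


def events_featuring(work: Work, events: list[Event]) -> list[Event]:
    """Every Event at which this exact Work object was featured."""
    return [e for e in events if any(w is work for w in e.works)]


dunkirk = Work("Dunkirk", "Movie", [Role("Christopher Nolan", "Director")])
lear = Work("King Lear", "Play", [Role("William Shakespeare", "Playwright")])

events = [
    Event("First viewing", "Cinema", works=[dunkirk]),
    Event("RSC King Lear", "Theatre",
          roles=[Role("Royal Shakespeare Company")], works=[lear]),
    Event("Second viewing", "Cinema", works=[dunkirk]),
]

print([e.title for e in events_featuring(dunkirk, events)])
```

Both visits to the same movie now share one Work, while the people who were physically present stay attached to each individual Event.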
Looking back, it feels silly that I made things so complicated to start off with. This is the kind of rabbit hole you can end up going down when you want to accurately model the world. The real world is complex and often doesn’t map well to the more binary, logical world of code. It’s easy to over-do this process (particularly when working on your own!) and end up with a more “accurate” but unwieldy mess that’s hard to maintain, hard to work with, and unnecessarily fiddly for the end user to use. And it’s never quite right.
The new structure is certainly good enough for my needs, and it’s relatively simple to enter data. Finally.
Oh, but what about if you go to see a movie that’s based on a play? Isn’t that like a “performance” of a play? Or, what about if you see a filmed version of a live performance of a play? What even is that? A play? A movie? What is the Work there?
It’s never quite right. Compromise.
§ Venues
I mentioned Venues earlier and they seem pretty simple. A name and a location. A place where Events happen.
Changing names
However, as soon as I started entering data from 28 years ago into my site I realised the biggest difficulty… Venues change over time. I’m not sure why I didn’t think of this initially.
If you’re entering data about a visit to see a movie at the MGM Trocadero in 1995 you want it to appear on the site as being at the MGM Trocadero. But later, when you enter data about seeing a movie at the same cinema, that’s now known as the UGC Shaftesbury Avenue, you want that visit to say “UGC Shaftesbury Avenue”. And you don’t want the previous visit to change from being at “MGM Trocadero”. So, they have different names, but they feel very much like the same Venue.
There is a slightly philosophical question here… What does it take to change one venue into an entirely new venue? Some options:
- When it changes name and branding. e.g. from “MGM Trocadero” to “Virgin Trocadero”.
- When it keeps the same name but has its interior reconstructed. e.g. it splits one screen into two or three.
- When it changes name and branding and has its interior reconstructed. e.g. Cineworld Shaftesbury Avenue becoming Picturehouse Central.
- When it’s roughly the same location, with the same company, but in an entirely different building. e.g. the RSC performed in the temporary Courtyard Theatre (on the site of The Other Place) for a few years while its nearby theatres were redeveloped. That was then replaced by a new The Other Place on the same spot.
- When it temporarily moves to a new location. e.g. the Almeida theatre “moved” to near King’s Cross while its permanent building was being renovated.
- When it permanently moves to a new location. e.g. The Odeon in Colchester was replaced with a new Odeon Colchester round the corner from the previous version in 2002.
I’ve tended to treat the first three of these cases as being the same venue over time. Even the radical transformation of the Trocadero cinema into Picturehouse Central feels, just about, like going to the same venue. On the other hand options 4-6 feel like they create separate venues.
So, given options 1-3, we need a way to keep track of a Venue’s different names over time. I didn’t think of that when I started.
The “proper” way to do this would be, I think, to have a separate model like VenueName which would have fields for name, start_date, end_date and a link to a Venue object. Any Event that occurred at a Venue would be displayed using the VenueName related to the date.
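A minimal sketch of that “proper” approach, in plain Python rather than Django. The `name_on()` helper is hypothetical, and the venue names and dates are placeholders invented for the example rather than real rebranding dates.

```python
from __future__ import annotations

from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass
class VenueName:
    name: str
    start_date: date
    end_date: Optional[date] = None  # None means "still the current name"


def name_on(names: list[VenueName], when: date) -> Optional[str]:
    """Return the name that was in use on the given date, if any."""
    for venue_name in names:
        started = venue_name.start_date <= when
        not_ended = venue_name.end_date is None or when <= venue_name.end_date
        if started and not_ended:
            return venue_name.name
    return None


# Placeholder names and dates, made up purely for illustration.
names = [
    VenueName("Old Name Cinema", date(1990, 1, 1), date(1999, 12, 31)),
    VenueName("New Name Cinema", date(2000, 1, 1)),
]

print(name_on(names, date(1995, 6, 1)))
print(name_on(names, date(2010, 6, 1)))
```

The lookup itself is simple; the cost is in having to find and enter a start and end date for every name a venue has ever had.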
However, this requires knowing exactly, or even roughly, when a venue changed its name, and adds to the complexity of data entry. This is one of those things it’d be worth doing for a bigger site, with more users, that had to be more robust. But for me it seemed like overkill.
I ended up with another compromise. Each Event has a field for venue_name. When creating a new Event the current name of the linked Venue is copied to that field. So, assuming Events are entered in chronological order you can change the name of a Venue as you work forward through time, and each Event will bear the “current” name. No matter how often you change a Venue’s name, the venue_name saved with existing Events won’t change.
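Roughly, the copy-on-create behaviour works like this. This is a standalone sketch using dataclasses, where `__post_init__` stands in for whatever hook the real Django model uses when an Event is first saved (an assumption on my part).

```python
from dataclasses import dataclass


@dataclass
class Venue:
    name: str


@dataclass
class Event:
    title: str
    venue: Venue
    venue_name: str = ""

    def __post_init__(self) -> None:
        # Copy the Venue's name as it is right now, unless one was given.
        if not self.venue_name:
            self.venue_name = self.venue.name


cinema = Venue("MGM Trocadero")
first_visit = Event("A film in 1995", cinema)

# Later, the cinema is rebranded and the Venue is renamed.
cinema.name = "UGC Shaftesbury Avenue"
later_visit = Event("A later film", cinema)

print(first_visit.venue_name)
print(later_visit.venue_name)
```

Both Events still point at the same Venue object, so they are grouped together, but each keeps the name it was created with.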
The only downside is that if you need to change the historical venue name for several Events in the past, it requires manually editing all of their venue_names, rather than only editing (or creating) a single VenueName object with the relevant dates. But, again, it’s Good Enough for my needs, and keeps things fairly simple. You can see the name changing over time on the Picturehouse Central page.
Screens and theatres
There’s yet another question about what a Venue is… is it an entire building? Or a particular screen, theatre, hall, etc. within the building?
For example, is the National Theatre a single venue or are its Olivier, Lyttelton and Dorfman Theatres separate venues? This seems like more of an issue for theatres than cinemas: a performance in the Barbican’s 1,156 seat Theatre is a very different experience to one in its 200 seat Pit. But you could say similar for cinemas: you may want to record whether you saw a film in a cinema’s giant main screen or its poky 30-seater.
And, close to home for me, is the Barbican’s Cinema One (large, in one building) the same venue as the Barbican’s Cinemas Two and Three, which are in a connected-but-different building up the road? They have the same branding and website but feel quite different. And then are those Cinemas Two and Three a different venue to the Cinemas Two and Three that were in a different Barbican building again until they closed a few years ago?
I don’t have good answers for any of these. It seems even less clear-cut than the “When does a venue become a new venue?” question. Sometimes I’ve made separate Venues for things like this, other times I’ve created one Venue and noted on each Event which theatre it was in (e.g.).
§ Conclusion
That’s how I got to where I am with all this. The site works fine and I love having all that historical data in it. The compromises I’ve reached work for me and aren’t too annoying. For other uses there would be different compromises required.
There are still a few things that, inevitably, aren’t quite right, aside from everything mentioned above:
- The “reading” and “events” parts of django-spectator are separate, only sharing the concept of Creators. Which means that if I read the print version of King Lear and then see the play performed, these are separate things in the database: a Publication and a Work (play). It’d be nice for these to be the same thing, or related somehow.
- If I go to see several short plays at one Event, I can only attach Creators to either each individual Play or to the Event as a whole. There’s no way to indicate which Play they were acting in, or directed. (e.g.) I’d need the more complicated Event > Performance > Play structure I retreated from above.
- Another part of my site records every time I listen to a piece of music, via Last.fm using my django-ditto code. It would be nice if a Creator I’ve seen live could be linked to the occasions I’ve listened to the music.
- If I go to an art exhibition this is currently an Event with a title and some optional Creators attached. And so if I go to the same exhibition multiple times there’s no direct relation between these visits. Maybe an exhibition should be a new kind of Work with its own Creators (e.g. “Artist”, “Curator”, etc.)?
These are all edge cases — there are always edge cases — and they don’t bother me too much. Yet. But they’re the kinds of things I’d bear in mind if I was making something like this for different people with their own uses and needs.
As it is this is all working, and full of data, and is very pleasing.