Architecture

Services

The tables below list the Docker containers that run ListenBrainz services, both as used in local development and as deployed in the MetaBrainz server infrastructure.

In production, the webservers run uWSGI to serve the Flask application; in development, the Flask development server is used instead.
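As a rough sketch (the wsgi.py module, app factory, and port below are illustrative, not the actual ListenBrainz layout), both modes serve the same WSGI application object:

    # wsgi.py -- minimal sketch; "create_app" is a hypothetical app factory,
    # not necessarily how ListenBrainz structures its code.
    from flask import Flask

    def create_app() -> Flask:
        app = Flask(__name__)
        return app

    application = create_app()

    if __name__ == "__main__":
        # Development: Flask's built-in server with debugger and auto-reload.
        application.run(host="0.0.0.0", port=8100, debug=True)

    # Production (illustrative uWSGI invocation; the real deployment flags
    # live in the server configuration):
    #   uwsgi --http-socket :8100 --module wsgi:application --processes 4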

Webservers

web (production: listenbrainz-web-prod)
    Serves the ListenBrainz Flask app for the website and APIs (except the compat APIs).

api_compat (production: listenbrainz-api-compat-prod)
    Serves a Flask app for the Last.fm-compatible APIs only.

websockets (production: listenbrainz-websockets-prod)
    Runs the websockets server that handles real-time listen and playlist updates.

Databases and Cache

redis (production: listenbrainz-redis)
    Redis instance used for all ListenBrainz caching.

lb_db (production: listenbrainz-timescale)
    TimescaleDB instance that stores listens and playlists for ListenBrainz. In the development environment, all databases live in the single lb_db container.

lb_db (production: postgres-floyd)
    Primary database instance shared by multiple MetaBrainz projects. The main ListenBrainz DB resides here, as does the MessyBrainz DB.

Misc Services

timescale_writer (production: listenbrainz-timescale-writer-prod)
    Runs the Timescale writer, which consumes listens from the incoming RabbitMQ queue, performs a MessyBrainz lookup, and inserts the listens into the database.

spotify_reader (production: listenbrainz-spotify-reader-prod)
    Imports listens from the Spotify API and submits them to RabbitMQ.

spark_reader (production: listenbrainz-spark-reader-prod)
    Processes results coming back from the Spark cluster, e.g. inserting statistics into the database.

rabbitmq (production: rabbitmq-clash)
    RabbitMQ instance shared by MetaBrainz services. ListenBrainz queues live under the /listenbrainz vhost.

Only Production Services

listenbrainz-labs-api-prod
    Serves a Flask app for experimental ListenBrainz APIs.

listenbrainz-api-compat-nginx-prod
    Runs an nginx container for the compat API that exposes the service on a local IP, not through the gateways.

listenbrainz-cron-prod
    Runs cron jobs for periodic tasks such as creating dumps, invoking Spark jobs to import dumps, and requesting statistics.

exim-relay-listenbrainz.org
    SMTP relay used by ListenBrainz to send emails.

listenbrainz-typesense
    Typesense instance (typo-robust search) used by the MBID mapping.

listenbrainz-mbid-mapping
    A cron container that fires off periodic MBID data processing tasks.

listenbrainz-mbid-mapping-writer-prod
    Maps incoming listens against the MBID mapping and keeps the mapping updated.

listenbrainz spark cluster
    Spark cluster that generates statistics and recommendations for ListenBrainz.

Listen Flow

How listens flow in ListenBrainz

Listens can be submitted to ListenBrainz using the native ListenBrainz API, the Last.fm-compatible API (API compat), or the AudioScrobbler 1.2-compatible API (API compat deprecated). Each API endpoint validates the listens submitted through it and sends them to a RabbitMQ queue based on the listen type: playing-now listens go to the Playing Now queue, and permanent listens go to the Incoming queue.
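A minimal sketch of that routing step, assuming the pika client and illustrative queue names (the real names come from the ListenBrainz configuration):

    import json
    import pika

    # Queue names are assumptions for illustration, not the production config.
    PLAYING_NOW_QUEUE = "playing_now"
    INCOMING_QUEUE = "incoming"

    connection = pika.BlockingConnection(pika.ConnectionParameters("rabbitmq"))
    channel = connection.channel()
    for q in (PLAYING_NOW_QUEUE, INCOMING_QUEUE):
        channel.queue_declare(queue=q, durable=True)

    def publish_listen(listen: dict, listen_type: str) -> None:
        # Validation has already happened in the API endpoint by this point.
        queue = PLAYING_NOW_QUEUE if listen_type == "playing_now" else INCOMING_QUEUE
        channel.basic_publish(
            exchange="",              # default exchange routes by queue name
            routing_key=queue,
            body=json.dumps(listen),
        )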

Playing-now listens are ephemeral and are stored only in Redis, with an expiry equal to the duration of the track (or a configurable fallback time if the duration is unavailable). The Playing Now queue is consumed by the websockets service; the frontend connects to the websockets service to display listens on the website without the page being reloaded manually.
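The Redis side might look roughly like this (a redis-py sketch; the key scheme, metadata field names, and fallback value are assumptions):

    import json
    import redis

    FALLBACK_EXPIRY = 600  # hypothetical configurable fallback, in seconds

    r = redis.Redis(host="redis")

    def store_playing_now(user_id: int, listen: dict) -> None:
        # Expire the key once the track should have finished playing.
        duration = listen["track_metadata"].get("duration") or FALLBACK_EXPIRY
        r.setex(f"playing_now:{user_id}", duration, json.dumps(listen))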

"Permanent" listens, on the other hand, need to be persisted in the database. The Timescale writer service consumes from the Incoming queue and begins by querying the MessyBrainz database for MessyBrainz IDs (MSIDs). MessyBrainz tries to find an existing match for the hash of the listen in its database; if one exists it is returned, otherwise the hash and data are inserted and a new MessyBrainz ID is returned.
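In outline, the lookup is a get-or-create keyed on a hash of the submitted metadata; a sketch assuming a psycopg2-style connection and invented table and column names:

    import hashlib
    import json
    import uuid

    def get_or_create_msid(conn, track_metadata: dict) -> str:
        # MessyBrainz matches on a hash of the raw submitted metadata.
        data = json.dumps(track_metadata, sort_keys=True)
        digest = hashlib.sha256(data.encode("utf-8")).hexdigest()
        with conn.cursor() as cur:
            cur.execute("SELECT gid FROM recording WHERE data_sha256 = %s", (digest,))
            row = cur.fetchone()
            if row:
                return row[0]           # existing match: reuse its MSID
            msid = str(uuid.uuid4())    # no match: mint a new MSID
            cur.execute(
                "INSERT INTO recording (gid, data, data_sha256) VALUES (%s, %s, %s)",
                (msid, data, digest),
            )
            return msid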

Once the writer receives the MSIDs from MessyBrainz, each MSID is added to its listen's track metadata and the listen is inserted into the listen table. The insert deduplicates listens on the (user, timestamp, track_name) triplet, i.e. at a given timestamp a user can have only one entry for a given track, though listens of different tracks at the same timestamp are still allowed. The database returns the "unique" (newly inserted) listens to the writer, which publishes them to the Unique queue.
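The deduplication can be expressed as an upsert that returns only the newly inserted rows; a sketch assuming psycopg2 and a simplified schema with a unique index on (listened_at, user_name, track_name):

    from psycopg2.extras import execute_values

    INSERT_SQL = """
        INSERT INTO listen (listened_at, user_name, track_name, data)
        VALUES %s
        ON CONFLICT (listened_at, user_name, track_name) DO NOTHING
        RETURNING listened_at, user_name, track_name
    """

    def insert_listens(conn, rows):
        # rows: (listened_at, user_name, track_name, data) tuples.
        # Duplicates hit the unique index and are silently skipped; only
        # the listens that were actually new come back, ready to be
        # published to the Unique queue.
        with conn.cursor() as cur:
            return execute_values(cur, INSERT_SQL, rows, fetch=True)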

The websockets server consumes from the Unique queue and sends the list of tracks to connected clients (just as it does for the Playing Now queue). The MBID mapper also consumes from the Unique queue and uses these listens to build an MSID->MBID mapping.
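Purely as an illustration of the fan-out (the actual websockets service is implemented differently), a consumer on the Unique queue pushing listens to per-user rooms might look like:

    import json
    import pika
    import socketio  # python-socketio; illustrative only

    sio = socketio.Server(async_mode="threading")

    def on_unique_listen(channel, method, properties, body):
        listen = json.loads(body)
        # Push the fresh listen to clients subscribed to this user's room.
        sio.emit("listen", listen, room=listen["user_name"])
        channel.basic_ack(delivery_tag=method.delivery_tag)

    connection = pika.BlockingConnection(pika.ConnectionParameters("rabbitmq"))
    channel = connection.channel()
    channel.basic_consume(queue="unique", on_message_callback=on_unique_listen)
    channel.start_consuming()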

Frontend Rendering

ListenBrainz frontend pages are a blend of Jinja2 templates (Python) and React components (JavaScript). The Jinja2 templates are bare-bones: they include a placeholder div called react-container into which the React components are rendered. Rendering the components requires some data, such as the current user's info and the API URL; this is injected as JSON into two script tags in the HTML page, page-react-props and global-react-props, to be consumed by the React application.
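On the Python side the injection might look roughly like this (the route, template, and prop names are illustrative, not the actual ListenBrainz views):

    import json
    from flask import Flask, render_template

    app = Flask(__name__)

    @app.route("/user/<user_name>")
    def user_page(user_name):
        # The serialized JSON ends up inside the page-react-props <script>
        # tag; global props (current user, API URL, ...) are injected the
        # same way into global-react-props.
        props = {"user": {"name": user_name}}
        return render_template("user.html", props=json.dumps(props))

    # In the Jinja2 template (sketch):
    #   <div id="react-container"></div>
    #   <script id="page-react-props" type="application/json">{{ props }}</script>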

Most ListenBrainz pages have a Jinja2 template and at least one React component file. The components are written in TypeScript, and we use Webpack to transpile them to JavaScript, compile the CSS from LESS, and minify and bundle everything. In local development, this is all done in a separate Docker container, static_builder, which watches for changes in frontend files and recompiles automatically. In production, the compilation happens only once, when the Docker image is built.

Using script tags, we manually specify the appropriate compiled JavaScript file to include on a given page in its Jinja2 template.