  • Speeding up the firehose

    As the final phase of my database experiments, I’ve been playing recently with optimizing the speed of processing in the firehose process.

    The Bluesky firehose is a streaming API that sends you every single thing happening on the network in real time – every post, like, follow, everything (encoded in a binary format called CBOR). And my firehose process is a client connecting to that stream, pulling data from it and saving the relevant parts (the posts).

    At the moment, the firehose flow ranges between ~300 and 700 events per second during the day (an “event” being one “action”, like a created or deleted record of any kind), with the bottom around 7-9:00 UTC and the top around 17-22:00 UTC. But the highest bandwidth we’ve seen so far was around 1.8-2.0k events/s in mid-November, when a massive wave of new users joined in the aftermath of the US elections – and we’ve seen traffic go up by a few hundred percent within days before too, e.g. in early September, when the whole of Brazil joined Bluesky. The firehose protocol also includes a lot of extra data, needed for cryptographic verification and full replication of data repositories but not used in a simple case like this – so the total data bandwidth to stream and process averages ~20-30 Mbps right now (with the smaller traffic than before).

    My server was struggling back then in November to keep up with the 1.8k events per second, which is a big part of the reason why I started looking for a database better suited than SQLite – to be able to handle the next wave. I had a few improvements lined up, which I’ve now started applying one by one (trying to do one at a time, so I can measure what difference each one makes), each time pausing the reading for a day or so, and then catching up for several hours at full speed, so I can check the maximum speed it can sustain.

    I was also doing some very simple profiling along the way, measuring how much time each part takes and where I could still save some extra microseconds:
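    As an illustration, here’s a minimal sketch of the kind of ad-hoc timing I mean – the section names below are made up, not the actual ones from the firehose process:

```ruby
# Minimal ad-hoc profiler: accumulate wall-clock time per labelled section.
# (Illustrative sketch - the section names are hypothetical.)
TIMINGS = Hash.new(0.0)

def timed(label)
  t0 = Process.clock_gettime(Process::CLOCK_MONOTONIC)
  result = yield
  TIMINGS[label] += Process.clock_gettime(Process::CLOCK_MONOTONIC) - t0
  result
end

# simulated pipeline steps:
timed(:decode) { sleep 0.01 }
timed(:insert) { sleep 0.02 }

TIMINGS.each { |label, secs| puts format("%-8s %.1f ms", label, secs * 1000) }
```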

    Overall, I managed to bring the number from around 2k evt/s to around 4k over a period of about a week (similarly in the MySQL and Postgres versions) – so not bad at all.

    These are the changes I’ve made:


    Partial indexes

    I’ve switched two indexes in Postgres to partial indexes (MySQL doesn’t support those). I have two indexes on the posts table, on quote_id and thread_id, which store references to the quoted post and the root of the thread respectively, and which are NULL for many of the posts (thread_id for about half of them, quote_id for the vast majority) – so I added a WHERE ... IS NOT NULL condition to those two.
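    For reference, a sketch of what those two index definitions might look like in SQL (index names assumed):

```sql
-- Rows where the column is NULL are simply left out of a partial index:
CREATE INDEX index_posts_on_quote_id ON posts (quote_id)
  WHERE quote_id IS NOT NULL;

CREATE INDEX index_posts_on_thread_id ON posts (thread_id)
  WHERE thread_id IS NOT NULL;
```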

    Effect: no noticeable change in performance, but at least the indexes take up less space now.


    ActiveRecord

    I updated ActiveRecord – from v. 6.1, which I had before, to 7.2 (Rails 8 was released in November, but it feels a bit too fresh right now). Ironically, AR 8.0 seems to finally fix my long-standing problem with concurrency errors in SQLite, just when I’m getting rid of it…

    Effect: around +10% processing speed.


    Ruby update + YJIT

    I updated Ruby from 3.2 to 3.4 and enabled YJIT, the (relatively) new optional JIT compiler written in Rust. I had tested YJIT before on 3.2 and there was a noticeable improvement, but I noticed some kind of memory leak back then, which they apparently only fixed around 3.3.1. Interestingly, my code seems to run slightly slower on 3.3 and 3.4 without YJIT, but enabling it more than makes up for it.

    Effect: around +15%.


    Batch inserts

    I was already previously grouping posts into a queue, waiting until I got 10, and then saving them all in one transaction, using code like this:

    ActiveRecord::Base.transaction do
      @post_queue.each do |p|
        p.save!
      end
    end
    

    This was already faster than the initial approach of saving each post separately, directly in the block where it’s built from the processed event. But it was still multiple INSERTs, and I wasn’t sure I could do it in just one while using the ORM.

    Apparently I can, with the use of the insert_all API. The catch is that it takes an array of raw attribute hashes, skips validations, and doesn’t handle child records (my Post records have linked FeedPost entries if they’re included in a feed). So I validate the records separately earlier (I was actually already validating them first before, which means the validations were running twice), and those that have FeedPosts, I save separately afterwards.

    So the whole thing looks like this:

    def process_post(msg)
      post = Post.new(...)
    
      if !post.valid?
        log "Error: post is invalid: ..."
        return
      end
    
      @post_queue << post
    
      if @post_queue.length >= POSTS_BATCH_SIZE
        save_queued_posts
      end
    end
    
    def save_queued_posts
      matched, unmatched = @post_queue.partition { |x| !x.feed_posts.empty? }
    
      if unmatched.length > 0
        values = unmatched.map { |p| p.attributes.except('id') }
        Post.insert_all(values)
      end
    
      @post_queue = matched
      return if @post_queue.empty?
    
      ActiveRecord::Base.transaction do
        @post_queue.each do |p|
          p.save!(validate: false)
        end
      end
    
      @post_queue = []
    rescue StandardError => e
      # ... try to save valid posts one by one
    end
    

    Effect: up to around +30% in the Postgres version, less in MySQL one.


    Rust native module for regex matching

    Last year I wrote a little native module in Rust (a language I didn’t know at all before) to speed up the matching of post contents against a large number of regexps for the feeds:

    I played with a benchmark of some regexp matching code in several languages last week (I wanted to replace that part of Ruby code with something faster, maybe Crystal or Swift). ChatGPT helped me write some versions :)

    The results were… unexpected. Swift & Crystal didn’t do well, but JS & PHP did 🤔


    — Kuba Suder 🇵🇱🇺🇦 (@mackuba.eu) April 16, 2024 at 7:34 PM

    I hadn’t enabled it in production yet though, because there were some edge cases I needed to get fixed, and so far I had only used it for manual feed rebuilding. So as the next step, I enabled that Rust module in the firehose process. (I’d love to eventually make it available as a gem you can easily add to your app.)
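    For context, the matching task itself looks roughly like this in plain Ruby (the feeds and patterns below are made up for illustration – this is the part the Rust module takes over):

```ruby
# Each feed has a list of regexps; a post belongs to a feed if its text
# matches any of them. Regexp.union combines a feed's patterns into one
# regexp, so the whole list can be tested in a single pass.
FEED_PATTERNS = {
  linux:   [/\blinux\b/i, /\bfedora\b/i],
  atproto: [/\batproto\b/i, /\blexicon\b/i]
}

COMPILED = FEED_PATTERNS.transform_values { |patterns| Regexp.union(patterns) }

def matching_feeds(text)
  COMPILED.select { |_feed, regexp| regexp.match?(text) }.keys
end

puts matching_feeds("New Fedora release is out").inspect  # → [:linux]
```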

    Effect: +15%.


    Asynchronous transaction commits

    Next, I turned on two options: synchronous_commit = off in Postgres, and innodb_flush_log_at_trx_commit = 0 in MySQL. These basically make it so that when you commit a transaction, you don’t wait until it’s synced to disk – the request returns with the data still only in memory, and it’s synced to disk a moment later (but like, up to a second later). So if the whole instance hard-reboots for some reason, you can lose maybe one second of data (but it won’t corrupt the database, since each transaction is either saved completely or not at all).
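    In config-file terms, these are (the Postgres one can also be set per session or per transaction):

```
# postgresql.conf
synchronous_commit = off

# my.cnf, [mysqld] section
innodb_flush_log_at_trx_commit = 0
```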

    This could still be dangerous if you’re saving users’ private data from a webapp, but in this case, I’m streaming public data that I can scroll back through a bit if needed. So in the highly unlikely case that I lose a second of data somehow, I just replay from a very slightly older cursor.

    Effect: no noticeable improvement… probably because with the batched inserts, it was already doing just a few inserts per second, so there wasn’t any more time to save here.


    Optimizing JSON handling and Time parsing

    I had a place in the processing function where it encodes a record parsed from CBOR into JSON, to be saved into a data column and used for logging – but for complicated reasons, JSON.generate was being called twice with slightly different versions of the record. So I refactored it to skip one of the encodes.

    There was also some time spent parsing the record time and the event time from ISO8601 strings into Time objects using Time.parse; I had a feeling that this could probably be sped up with some native code, especially if it only accepted ISO8601 formats, and only the variants that actually appear in the data (since Time.parse handles many, many possible formats). With the help of Mr. GPT I made a little C module with a single function to replace Time.parse, and it indeed ended up being faster. (I’ll also try to turn that into a micro-gem.)
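    A related stdlib option worth mentioning: Time.iso8601 accepts only that one format and skips Time.parse’s format guessing, so it’s already cheaper per call even before reaching for native code. A quick comparison sketch (the timestamp is an example value, not taken from the real data):

```ruby
require 'benchmark'
require 'time'

timestamp = "2024-11-15T17:03:22.123Z"

# Time.parse tries to guess the format; Time.iso8601 parses one fixed
# format, which makes it noticeably cheaper per call.
n = 20_000
Benchmark.bm(14) do |bm|
  bm.report("Time.parse")   { n.times { Time.parse(timestamp) } }
  bm.report("Time.iso8601") { n.times { Time.iso8601(timestamp) } }
end
```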

    Effect: +8% (this finally pushed it over the 4k level)


    Jetstream

    Finally, just to check for now, I’ve switched from the full CBOR firehose to Jetstream, which is a kind of pre-processed stream that strips all the signatures and Merkle tree stuff and outputs a simplified JSON stream. I’ve already added support for it to my firehose library Skyfall in October, but I’m not using it everywhere yet.

    Effect: +30-40%, although the speed was suspiciously capped at almost precisely ~4989 evt/s on both servers… so I’m guessing I might have hit some kind of rate limit of the Jetstream server. But I could also host it myself if needed.

    (Edit: yes, apparently 5k is the default rate limit configured in Jetstream server settings.)


    Potential ceiling

    I also did a test with all processing completely skipped, so the app would just read the data packets from the websocket and discard them, not even decode the CBOR. This did around 6k evt/s with the full firehose (possibly more for Jetstream, but I couldn’t test this) – so it looks like this is the max I can do without switching to something completely different. The library I’m using for websocket connection uses the very old eventmachine library underneath, and I was told that the newer async-websocket that uses the Async framework could be faster, but in brief tests I wasn’t able to see any improvement so far.

    Finally, just to see what the limit would be if I were to abandon Ruby (at least for that part), I ran a quick test in Rust, specifically using the example from the Atrium library. It did… 25k evt/s 😱 So yeah, there are still more options ;)

    That said, my current VPS will max out the allowed traffic at a sustained 300 Mbps, which if I’m counting correctly would be something like 7.8k evt/s from the full firehose, so above that level this would be the main problem.

    → 6:26 PM, Mar 18
  • MacBook Air vs. Pro

    I’m a very picky person when it comes to buying things… especially Apple hardware. I can’t count the number of Apple devices I tried out and sent back, because something wasn’t right. Because there are always tradeoffs, and I don’t like tradeoffs, I want it to be just perfect 😅

    So since I really needed a new MacBook (I’ve been using an M1, which would be fine if only I had bought it with more than 8 GB RAM… 🤦🏻‍♂️), last autumn I decided to test MacBooks this way. I wasn’t sure if I wanted an Air or a Pro, since they’re quite different right now. There’s this holiday promotion that Apple runs each year around Black Friday, where if you buy something in November-December, you can safely return it until the 2nd week of January (!) instead of the usual 2-week period. So I decided to buy both an M3 13" Air and an M4 14" Pro, try them out, and pick one to keep in January 🙃

    Spoiler: I kept the Air in the end. The Pro does some things better than the Air, but it’s noticeably thicker and heavier, and using the classic MacBook Airs for ~5 years in the last decade has spoiled me forever – I just refuse to go back to heavier, bulkier laptops.

    There are a few more differences that I wrote down, so here’s a short subjective comparison if you’re looking at those shiny new Sky Blue Airs and wondering whether you should get one of those or the Pro…

    Note: I haven’t really used the Pro that much, certainly not all the time for 2 months – more like a few times for a few hours. I’m writing on the Air right now.

    Size / exterior

    As mentioned, there is definitely a difference. For comparison:

    • current MacBook Air 13.6" (2025): 1.24 kg
    • Retina MacBook Air 13.3" (2020): 1.29 kg
    • classic MacBook Air 13.3" (2015): 1.35 kg
    • older MacBook Pro 13.3" (2020): 1.40 kg
    • current MacBook Pro 14.2" (2024): 1.55 kg (M4) to 1.62 kg (M4 Max)

    So the lightest Pro is 150 g heavier than my previous M1, 200 g heavier than the classic MacBook Airs, and 300+ g heavier than the current Air (!). That difference is about two thirds of an iPad’s weight, and it’s absolutely noticeable. It might not be a problem if you use it as a desktop computer, but it is a problem if you need to carry it around all the time, or if you have the “on the lap, lying on the couch” kind of workplace like me…

    The Pro is also a lot thicker (and at least it feels thicker than the old 13.3" Pro). It’s not that much of a problem, but I definitely like how super thin the Air is.

    I remember the Pro also had those kinda annoying slots with slightly sharp edges at the bottom, probably for airflow – the Air doesn’t have those.

    Screen

    The Pro has a slightly bigger screen, though not by that much.

    ProMotion (120 Hz) on the Pro is immediately noticeable, but I can’t say if it’s important or even clearly better – it’s a bit weird at first (feels kinda… iOS-y?), though probably something I could get used to rather quickly. I can’t say I’m missing it on the Air at all, but then I’m not that used to this mode, since the only devices I’ve owned with it are my previous and current iPad.

    The Pro has a better DPI – the screen has a “full 2x Retina” resolution (1512 x 982 “point” size), while the Air uses a scaled resolution (1280 x 832 “point” size, while the standard display resolution is 1470 x 956), so the Pro’s screen is sharper. Again, the past two MacBooks I’ve used also used the scaled resolution, so it’s not something that’s bothering me.

    The Pro has an HDR mode with higher brightness – I tested it by watching “Silo” on Apple TV, but… I had somewhat mixed feelings about it. I’m not sure it looked better on the Pro, with the higher contrast and real blacks, than on the Air with its less-black blacks. And the white interface elements that show up on hover were kinda blinding.

    The Pro is technically capable of higher peak SDR brightness, but it’s hard to say how this works exactly – the specs say “up to 1000 nits (outdoor)” (vs. the Air’s 500 nits). Note the “outdoor” though – it’s something that uses the light sensor and only enables the higher limit in a bright place. So if you took both of them outside on a bright day, the Pro would probably be much more usable, but in a rather dark room, they looked pretty much the same at max brightness. There might be some hacks to unlock the limit, though (?). Anyway, as someone who usually keeps brightness at max, and who has sent back one Air in the past because it was too dark (400 nits), the current Air is good enough for me.

    I was a bit worried about the screen color balance – Apple has this annoying thing where every device has a slightly different white tone: sometimes it’s more yellowish, sometimes more blueish, sometimes more towards violet… and it’s not always possible or easy to calibrate that away. Some of the past MacBook Airs and the new OLED iPads are too yellowish for me, for example.

    So these two… are both kinda yellowish, the Pro even more so I think. But they can be calibrated a bit – the Pro has this more advanced color calibration thingie that I’ve never seen before, that’s missing on the Air, which lets you change color balance on a hexagonal grid. The Air has a standard color profile wizard with a white point slider. But I’ve moved it slightly towards the blue and it’s ok. If you’re used to Apple’s current OLED screens (I’m not), you won’t notice this.

    Notch

    I was horrified when they first announced the MacBook with a notch in the screen… I was expecting this to be a big problem. It’s not really a problem in practice, and I say this as a notch-hater. I know there’s some hack that lets you run a smaller resolution, pushing the menu bar down and making the screen fully rectangular, at the cost of lost vertical space, but I don’t think I’d want that – those pixels definitely count. The only thing I don’t like is that the notch swallows overflowing icons from the menu bar, they just disappear if you have too many. I was recommended several tools that work around this in that thread, but I haven’t tried any yet.

    Keyboard

    I think both have fairly similar keyboards – the good one that Apple switched back to from the terrible butterfly one. They both felt ok to write on.

    However, they do look different – the Air has the classic black keys on silver background design, while the Pro has black keys on… black. I’m not a fan of this, tbh – the keys are much harder to see. They do have the backlight of course, but you kinda need to rely only on that.

    Ports

    The Pro has an HDMI port and an SD card slot. I couldn’t care less tbh – I use these very rarely, and I got adapters. I prefer a thinner laptop instead.

    The bigger difference is the extra USB-C port on the right, while the Air only has ports on the left – that would be kinda useful, but oh well.

    The Pro also has the headphone jack on the left instead of the right, but I’m not sure that matters. It might actually be better this way tbh – I think they used to be on the left long ago, and I liked that more?…

    Temperature

    The Air definitely runs a bit warmer at idle than the M1 MBP did (it doesn’t really fall below 40°C while running), and it gets warm quicker when doing something serious, and I don’t really like this… Even if the CPU temperature is actually the same, it feels warmer on the bottom, which matters when you’re mostly holding it on your lap. I guess it might be because of the fan-less internal design, or things are just arranged differently inside somehow. Although I think the Pro was also slightly warmer than my M1. (I’m curious how the new M4 Air will turn out in complete reviews…)

    Sound

    The Pro has better sound from its speakers – not that the Air sounds bad, but the Pro definitely sounded better somehow (note: I’m not an audiophile at all).

    And surprisingly, the Air has a different-sounding touchpad. I think the click sound is the same here as on the 2020 MacBook Air, and different from the current and last few generations of Pros – the Pros have a kinda more “metallic” click sound, and the Airs' touchpad click is… I don’t know how to describe it, a slightly lower, more “dull” tone. I think I like the Air version more.

    → 7:05 PM, Mar 6
  • Year 2024 review

    2024 was a pretty intense year for me in terms of coding… and a pretty not-intense one for anything else 🫣 I was so completely absorbed with building my Bluesky-related projects. It gives me a lot of satisfaction to be able to work on things which are based on my own ideas and needs, which give me fun problems to solve, let me learn a lot, and which seem to be helping a lot of other people.

    I figured I could look back at this past year and make a summary of what I managed to build and do, since it’s been a while since I’ve done one of those. Yes, I know we’re well into February already, I’ve been procrastinating, sue me. (Also, we’ve just had an anniversary of Bluesky public launch a few days ago, so that seems like a good excuse too.)


    Bluefeeds

    Bluefeeds is my internal codename for the Ruby project which runs my main server at blue.mackuba.eu, which includes my feed service, the Handle Directory, statistics and so on.

    • New feeds: ATProto feed, ATProto projects feed, Kit feed 🐱, News From Mastodon, Bluesky Team Links feed
    • The two Replies feeds: Only Replies and Follows & Replies, which have gained a lot of popularity after the Following feed changes in the summer, when the option to show all replies in Following was removed, and which are now my two feeds with the biggest traffic
    • A work-in-progress Hashtags feed (for following hashtags, currently requires running a Ruby script to add tags)
    • Some continuous updates to existing feeds, like the Linux feed, where I’ve recently had to fix the name “RedHat” appearing in completely unrelated contexts 🫠
    • The Handle Directory, listing handles grouped by top-level domain, a global list of most popular accounts with custom handles, and a ranking of independent PDSes
    • The statistics page, showing daily and weekly post stats, later also separately for bridged posts from Mastodon/Nostr and posts on independent PDSes
      • I also periodically post stats of most popular languages on Bluesky – what % of posts are in which language, and how many users are there that regularly post in each
    • The very unstyled labellers list page, which was even briefly featured in a CNBC video for a few seconds 🤯
    • An automatically updating list of official .gov and .edu accounts with verified handles, plus an open source repo with the script that updates them
    • I also managed to learn some basics of Rust 🦀 and to write – with major help from ChatGPT – a tiny native module for Ruby in Rust, which I knew absolutely nothing about before, to speed up the matching of post contents to regexps (I’ll try to write a bit more about this at some point and share the code)
    • … and I’ve spent literally months working on migrating this whole thing from SQLite to either MySQL or Postgres, and I’m still not done 😩

    Skythread

    • Added a hashtag feed view, listing latest posts with the given hashtag, back when this wasn’t working in the Bluesky app yet
    • Similarly, added a post quotes counter and a view listing the post quotes, before that was added to the app (you can still use it now to peek at the quotes which are hidden/detached in the official app 😈)
    • Implemented link highlighting and link cards
    • Added special handling for bridged Mastodon posts, which are showing author’s formatted Mastodon handle and the expanded full-length post contents (since Bridgy stashes the full original text in a custom field in the post records)
    • Added an “infohazard” feature, which lets you load hidden comments which the official app hides because of the “nuclear block” between two commenters

    Some of these, like the quotes list and the hidden comments, use data loaded from the Bluefeeds server APIs – since I’m saving all incoming posts to one big-ass table, I can index them by quote source and by thread root, and can use that to look up all quotes of a post or all replies which are supposed to be there in the thread.

    Blog

    • I wrote a super long post “Complete guide to Bluesky” with a lot of explanations of how things work and various tips and tricks (and updated it a few times, as new features were added)

    Skyfall

    • 7 releases, adding support for new features like labellers or account state events
    • added support for streaming from Jetstream, parsing events from JSON instead of CBOR

    Minisky

    • 2 releases with small improvements (OAuth support is still waiting for a better day)

    New projects & tools

    • Label Scanner – a simple web tool that shows all labels assigned to an account/post by any of the known labellers
    • sdk.blue – list of libraries and SDKs for Bluesky/ATProto, grouped by language
    • tootify – a simple tool for selective cross-posting from Bluesky to Mastodon, which I use to mirror some of my Bluesky posts to my Mastodon account
    • didkit – a Ruby library for resolving handles to DIDs, loading DID info like assigned PDS host etc.
    • a bookmarklet to easily mute notifications in the Bluesky web app
    • two Ruby scripts to scan your home feed or some selected accounts and print the stats of who is posting how much daily

    Servers

    • I’ve migrated my projects from my old Swiss hosting Coin.host to German Netcup, which has VPSes with really fast CPUs and SSDs and a ton of bandwidth included for a really good price
    • I’ve set up my own self-hosted PDS at lab.martianbase.net, and migrated my main account there

    Oof… that’s a lot of stuff 🥲 What’s next, in 2025? I… I really think I need to work a bit more on work-life balance (sideproject-life balance?) this year. I feel like I’m doing way too many things. But I have no idea if that’s going to happen. I think at least I should try to do a bit more project finishing and a bit less project starting.

    First of all, I can’t wait to get out of this transitional state with that main service running on three different databases in parallel. I think I’m really close now (though I’ve been saying this for weeks).

    Then, I have all of the above to maintain, OAuth to add to Minisky & Skythread, things like better documentation and tests to add to the open source projects. I have a few ideas for longer blog posts to write, e.g. some kind of architectural introduction to ATProto.

    And also… I’d really love to write an AppView one day (but yes, I know, less project starting 😛).

    → 7:42 PM, Feb 10
  • Postgress progress

    I think I’ve finally made some good progress with my Postgres database. Things are working pretty well now, and I still have a few fixes in the queue. It’s not all finished yet, but I feel like I’m coming out of the woods now.

    For those out of the loop: I have a server which streams and saves all posts from the Bluesky firehose, and uses this data to run several feeds on Bluesky used by 1k+ people, plus generates daily post statistics, the directory of custom handles and things like that.

    This has all been running on SQLite so far (a 200+ GB file now), and it has served me well for far longer than I would ever have expected, but I’m finally hitting its limits. The main problem is the lack of support for concurrent writes, which makes it unrealistic to e.g. split the work across multiple workers, which I’m gonna have to do sooner or later.

    I’ve only really used MySQL on servers before, but everyone was telling me I need to use Postgres, so I did what any normal person would have done (right?): I set up two test servers, one on MySQL and one on Postgres, running from two modified branches of the code, and made the SQLite-based production server proxy some requests to one or the other, in order to test them on real traffic. A sort of database A/B testing 🤪

    It has been working like this for a few months now, and I’ve been slowly tweaking things, mostly on the Postgres side. Ever since I first got them to work, they’ve both been working OK, but MySQL has generally been doing more write I/O for the same workload, and Postgres comparably more read I/O.

    The key part is the Replies feeds (Follows & Replies and Only Replies – it’s basically the same code). These two are variants of the built-in Following feed, showing the connecting user their own chronological timeline, generated live and filtered in some way. They work by fetching the requesting user’s follows list, either from the AppView or from a local cache, and then making a query that’s basically: “give me the most recent 100 posts from any of these 50/200/1000/5000 users”.

    There are two relevant indexes in that table: one on (repo, time) (repo = user’s DID), and one on just (time). Roughly speaking, for those users who follow e.g. 80 or 200 people, it makes more sense to scan the (repo, time) index those 80-200 times and collect the 100 most recent posts from all of those found, and for those who follow e.g. 9000 (yes, that happens 😛), it’s faster to scan the single (time) index until you find 100 relevant posts. But I’ve been struggling to make Postgres always use the right index.
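    In SQL terms, the query is roughly this shape (simplified – the actual column names and parameters may differ a bit, and the DID values are made up):

```sql
-- "give me the most recent 100 posts from any of these users"
SELECT * FROM posts
WHERE repo IN ('did:plc:aaa', 'did:plc:bbb' /* , ... */)
ORDER BY time DESC
LIMIT 100;

-- and the two indexes the planner can choose between:
CREATE INDEX index_posts_on_repo_time ON posts (repo, time);
CREATE INDEX index_posts_on_time ON posts (time);
```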

    At first it tried to use the (time) index way too often. I tried to find a way to force it to use a specific index, but I was told it’s not possible (it might be, though!). With the help of some people on Bluesky, I managed to rewrite the query to use the other index, but then it performed badly for people with a lot of follows. I interpreted the EXPLAINs as the query planner not having good enough information about the structure of the data (it estimated n_distinct, the total number of distinct values in a column, as much lower than the actual number of users in the table), so I started messing with the “statistics” settings. I ended up bumping the statistics target up to the maximum of 10,000 (making ANALYZE take really long) and hardcoding n_distinct to 4.5M, which was closer to the actual count. I also increased shared_buffers to 1 GB. This improved things somewhat, to the “not great, not terrible” level, but I was still not happy with the performance.

    But eventually, I was able to improve things a lot over the last few weeks:

    • First, I was re-reading one of the blog posts about the differences in how MySQL and Postgres organize data internally, and I had a bit of an epiphany: unlike MySQL, Postgres doesn’t store the primary key (id) in secondary indexes, since they reference the physical location of the data instead. In MySQL I had a version of the query which first did a subquery that collected just post IDs, and then used the final 100 IDs to fetch posts, but this couldn’t work here if the indexes don’t include id! This was something unexpected for me. So I replaced the larger index with (repo, time DESC, id) and rewrote the query again with a subquery, and the EXPLAIN showed “Index Only Scan” now, meaning it could use just the index for the subquery.

    • Next, the Index Only Scan was showing something like: “Heap Fetches: 31362”. This apparently means that even though it tries to only use the index in that step, it still has to go fetch 30k post rows from the actual table. This is because as records are updated or deleted (in my case only really deleted), Postgres just marks them as deleted, but doesn’t free up the space or update the index until a VACUUM is done. A separate structure called “visibility map” tracks which table pages have some outdated records in them, and if an index entry points to such page, it has to go check if that record is still there. So I started tracking the state of this visibility map, and set up a cron job to do the VACUUM twice a day to keep it relatively “clean”. (I wish it was as easy to do with my apartment…)

    • Now, it was working great for people with short lists, but it still tried to use the (repo, time, id) index for most very long lists. The EXPLAIN showed that it was vastly overestimating the cost of using (time). I came to the conclusion that… the hardcoded n_distinct was actually working against it now, even though it was technically more precise. In my practical use cases, what actually matters is not how many different users have at least one post anywhere in the table, but how many are actively posting – whose posts you can find in the last few hours or a day. And that’s much less than 4.5M. So I cleared the hardcoded n_distinct and let it find its own estimate, and reset the statistics target to the default 100 – and this improved things quite a bit! It was now almost perfectly choosing index_posts_on_repo_time_id up to around ~1200 followed accounts, and index_posts_on_time above that.

    • Not related to the query/Postgres directly, but I also added a “prefetching” task to the follows list cache: it runs every 15 minutes and checks which follows lists are approaching the point where the cached data becomes stale (but was accessed recently) and reloads the list from the AppView in advance. This saves a couple to several seconds of time every time the list had to be refreshed before returning the feed, so it made quite a lot of difference.

    • One last thing I need to fix is that for long lists, significant time is spent on query planning (e.g. 1724 ms planning time and then 12 ms execution time). I’m gonna try to restructure the query to pass the list of user IDs differently somehow and see if that helps, or perhaps I could move the cache to a new database table instead of keeping it in .json files.

    • And finally, I have some more config tweaking to do - e.g. it seems shared_buffers could be bumped up to at least 2 GB here (with 8 GB total RAM) - but I wanted to first make sure it’s doing the right things before I give it more space to cache it.
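    The covering-index rewrite from the first point above can be sketched like this (a sketch with made-up DID values, not the exact production query):

```sql
-- id added as the last column, so the subquery can run as an Index Only Scan:
CREATE INDEX index_posts_on_repo_time_id ON posts (repo, time DESC, id);

-- the inner query touches only the index; the outer one fetches the 100 rows:
SELECT p.* FROM posts p
WHERE p.id IN (
  SELECT id FROM posts
  WHERE repo = ANY (ARRAY['did:plc:aaa', 'did:plc:bbb'])
  ORDER BY time DESC
  LIMIT 100
)
ORDER BY p.time DESC;
```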

    I’m really looking forward to when I can finally set up a production instance with Postgres and move everything there and drop the other branches - but I feel like this moment is getting close 😅

    → 10:39 PM, Jan 16
  • new domain

    who dis

    → 8:49 PM, Jan 8
  • test post

    test 123

    → 5:52 PM, Jan 8
  • Hello world

    just setting up my mcroblog

    → 7:55 PM, Jan 7