Yeah, I’m not too surprised about that result. It’s hard to beat 4.9 years of average experience with AI in anything, code or not. Someone coined the term “plastic software” the other day to describe AI-generated code: easy and cost effective to make, but very inflexible and hard to work with once it’s made.
https://x.com/zohrankmamdani/status/1857410216733651146 From Ryan Broderick this morning. Mamdani released a video last year right after the election where he went to the New York City neighborhoods that supported President Donald Trump the most. And it’s a fascinating watch because the majority of what Trump supporters said they wanted from Trump are almost verbatim what Mamdani ran on: Lower prices on food and gas, cheaper rent, and an end to the conflict in Gaza.
Now, of course, Trump is a liar and nothing he’s done over the last six months has made life easier for anyone, let alone working class Americans, especially the immigrants in the neighborhoods that Mamdani visited, but Trump’s promise to do something about those issues clearly resonated with people. And it’s, frankly, shocking that it took a leftist candidate this long to capitalize on those false promises. Definitely seems like a better strategy than whatever the heck Project 2029 is.
Totally agree. I mean, imagine if the Dems ran on something like this? But replacing halal with groceries?
The thing I’m missing from the Zohran discourse is that they stay approachable, calm and collected while also discussing issues that voters actually care about. There’s an article I can’t find now about the success of some Italian(?) mayor who won by a landslide against a populist screamatron by simply promising to once and for all fix the trash problems plaguing the city. Outrage has little to say against pragmatism.
I briefly considered sending you that album, before realizing there’s no chance you wouldn’t already know about it. So far I like the other Origins better, but it can also just require some time to grow on me. Not a fan of the lyrics personally.
Greetings from Belgium. Doing a four-day themepark trip so I’m living my best life. If all goes to plan I’ll have over 50 new rollercoasters under my belt this year alone, which is more than my entire life up to this year combined. My white whale is to one day go to Sandusky, Ohio to ride Top Thrill Dragster (now slightly altered to TT2) because there’s a ten year old kid inside of me who needs to fulfill his dream of one day riding what he saw on the Discovery Channel. I’m also getting more into meditation. I now regularly do 45 minutes when I wake and at dusk and it feels right to do so, in a way that logic can’t quite explain. I’m not using a meditation app like I did when I meditated years ago, but I am learning about yogic and Buddhist traditions and trying them out with a nonzero amount of conviction. For example, meditating with incense is better and I can’t explain to you why. (It’s worse for the poorly ventilated attic I meditate in because the smell lingers forever. So I’m not gonna do that on the reg.) The funny thing is that my mom has been into spiritual and New Age stuff for so long that I have grown a natural dislike for anything remotely related to chakras. But fuck, keeping up my practice and slowly increasing detachment is yielding me discipline, energy and a level of mental calmness and clarity that I wanna keep walking down this path, even if just to see where it leads to. Earlier this week I went to a small get together for a newly formed TTRPG club/Discord in my city. It was great, almost everyone was 30 something (which leads me to hypothesize the D&D 5e wave is squarely a millennial thing) and it was fun meeting new people. I’ve never done so before but I signed up to help figure out how to build this small and fragile start into a community of some sort. This piece has been helpful already:
Yeah that 5-hr drive (plus at least 30-50 minutes of breaks I hope) is just over 3 hours with the high-speed rail from Amsterdam to Paris, despite it having 4 stops along the way. I'd rather hop on a train and read a book or something. Back when I was a student and lived near Rotterdam, it took exactly the same amount of time in minutes to travel to my parents in the north of the Netherlands (a 2hr drive) as it took to take the train to Paris (a 5hr drive). HSR is wild like that. Realistically, it only works well between 100 and ~400 miles. Barcelona-Malaga is 7 hours for 620 miles and I've done much longer HSR trips than that? But most people will still fly. Hell, I seriously considered taking the train to Malaga from here. I can technically get to Barcelona in one day. Malaga would require an evening train to Paris and a full day of trains from Paris to Malaga (two trains of 7 hrs each), which is doable for your average climate fascist, but it doesn't hold a candle to a 3hr flight. I just think that even in the US, there are a ton of trips in that 0-400 mi range that don't need to be flown. Your weekly urban system might've stretched all the way from Seattle to LA, but that's the exception and not the rule.
I don’t buy the “US is too big” argument personally. The bigger issue is that good/heavy/highspeed rail infrastructure begs for on-par lower hierarchy transit options and those are few and far between. But NY to DC is an absolute no-brainer.
They are now being forced to not delete any chats anymore due to a lawsuit. So at least you can't looksmaxx anymore in private...
I do think it's hard to fundamentally change open models to prevent this? My understanding is that with open weights models, you could in theory just put the model back in the oven to train the safety features out of them again. (Although I might be wrong about that.) Facebook is entirely to blame for opening this particular Pandora's box though.
BUT WAIT IT GETS WORSE One developer, who only goes by the name Lore in their communications with the media, described the open-source release of the large language model (LLM) Llama as creating a “gold rush-type of scenario”. He used Llama to build Chub AI, a website where users can chat with AI bots and roleplay violent and illegal acts. For as little as $5 a month, users can access a “brothel” staffed by girls below the age of 15, described on the site as a “world without feminism”. Or they can “chat” with a range of characters, including Olivia, a 13-year-old girl with pigtails wearing a hospital gown, or Reiko, “your clumsy older sister” who is described as “constantly having sexual accidents with her younger brother”.
An addendum:
There are two anecdotes that color my opinion here. 1. If you'll recall I worked for a while inside a tech startup building a mobility app. Dev work was split into a small onshore and a large offshore team in India. The onshore team was, essentially, the product manager and four coders of varying seniority. The offshore team was something like 20-30 fulltime developers? If there's anything I'd want to get built or get changed I had a feedback slack that the onshore devs would process into Jira. If there was anything big that I needed to be implemented, I'd get the onshore junior dev to essentially parse my request into Jira for me, often after a few meetings because chopping my idea into baby-sized steps is never a straightforward task. It'd be well-documented what the desired change was, and what the steps were to get there before it went to offshore. Then a week or so later they'd be working on it and have a bunch of extra questions. Then I'd get a new TestFlight version of the app, I'd do a bunch of testing with that, and would often come back with half a dozen edge cases and misinterpretations. Back and forth, testing, questions, back and forth, testing and then it would usually be fine. 2. A good friend of mine works for a Dutch competitor to EPIC. She graduated CS just over a year ago. Basically landed the job as a college intern and was allowed to stay. The first months of her job, she was not allowed to write code, instead just having to review other people's code in order to initiate osmosis for the inner workings of EHRs because there is, essentially, no documentation. The entire company from what I gather seems to operate on tribal knowledge, the elders passing down quirks and edge cases that stay in. It is also company policy to forbid writing any documentation in the code. Instead of documentation, they've created a layer cake of processes that code has to go through to be reviewed and checked. 
The processes are well-defined, but what goes through them could be whateverthefuck. Most of the time, the changes are very small. They're scared to accidentally break anything in production (and hey, rightly so!) but that also means they rarely change anything large, so the codebase is never refactored and is a layer cake too, of cruft accumulated by hundreds of individual developers making changes and writing code in their little corner of the monolith in their particular way. You don't think you can include that in the prompt? Or documentation it reads? I had to regularly instruct the offshore team to do things in a particular way in the app, because I was using a third party tool for our analytics and that tool needs data in a specific way to work. "Yeah I know this is not ideal but can you please define sessionID as text, even though it always consists of numbers only?" The timeline in the past half year or so has been wild, in my humble opinion. When the term vibecoding was coined this January, Claude 3.5 was the best we had, which is (still) quite good at writing a few dozen lines of code but gets lost as soon as it's even slightly bigger than that. My first vibecoding experiences were...rough. The improvements in the past months have been incredibly significant but specifically for coding. For my first vibecoded app I had a prototype in 2 minutes and would spend a good hour or two debugging by "there's a new error, go fix it" again and again. For the OVguesser app I mentioned in Pubski? I had a prototype in 2 minutes and...essentially only one or two bugs, despite being a more complex app with 50+ files spread across client and server. I could go straight to "this is great, let me tweak it until I'm happy". Adding reasoning, better prompts, and most of all tool use (you tell the AI how it can Google something, how it can grep a file, how it can use any tool you can imagine) has dramatically improved its ability to do the vast majority of software work.
I used to be able to tell AI-gen code and regular code apart. I lost that ability - there are no longer six fingers on the hand, the lighting isn't "off" anymore, especially not with the right prompt. When I showed my EHR friend this week the code Sonnet 4 produces, she too could not find anything bad about it. It genuinely just writes decent code now. The size of what it can write has dramatically increased, from autocompleting your sentence in an IDE to writing functions for you to, now, being able to write an entire app out of nowhere that sometimes actually works. Now - that does not mean Jesus can take the wheel. I know. The main issue, now, is that the agents are not very good at exploring the solution space. When the code base becomes larger, they often struggle to take the logic of line 265 in file A into account when writing line 1,038 in file B. Or they are too eager to jump to the first solution that sounds remotely like it could work. So you end up with short-sighted solutions that break something else somewhere. It really, really needs checks and balances now to prevent the  's from happening but let's be honest, do you realize how often shit breaks in normal software development? Are the blanks you are getting from the EHR API any better than the  's because those shit sausages were made differently? Even if there is not a single inch of progress in the models, I'm fairly certain we'll still see progress in the coming years in the ability of AI to improve software engineering. What I didn't know, until reading this article, is what that future could look like. I don't think the blog is a blueprint? There's every chance it will be kneecapped in multiple ways? But I'd be surprised if this is not the direction we'll be heading in for the next few years. Right now we're in the It Moves Fast And Breaks Things era. But that era will end and I'm intrigued slash terrified slash in awe of what that might look like.
In a way, some of this is already here and working, just in small pockets. NotebookLM's podcast feature is a single button to the user but behind the scenes it is basically an agent cluster that takes a document, creates a podcast script, refines the script to add uuhs and ahs and other vocal nuances, and text-to-speeches it. Not just in a single pipeline, but even on-the-fly when you "call into" the podcast. You take a complex task, break it up into its constituent parts, and refine an agent with a specific agent-prompt and toolset. Then you tell the smartest AI you can afford "these are your minions, go do this overarching task" and have only that one talk to the human in the loop. From my experience, both this concept of agent clusters and vibecoding feel eerily similar to the way I worked at that mobility app. The onshore dev I worked with the most was a junior dev. He wasn't particularly bright or experienced (he was the same age as I at the time) but a) he had a few tools at his disposal b) he knew what the architecture of the app looked like and c) he was somewhat good at reasoning about code (but I'd often be just as good). His task was to pour my request into the molds of the Jira processes they were used to and he would delegate everything else to his offshore colleagues to actually do. Along the entire process of going back and forth with the junior dev and at times with the Indian devs I'd do nothing different from what the article describes as agent babysitting. And the result of that process was often just, like, implementing 1 new class or function call in the backend. I also don't think my friend at Not-EPIC should be unworried. She knows how wretched the way they develop is. She doesn't know what her code does IRL or what workflow it could break. This fall she has a surgery coming up which will knock her out for a good three months.
That means for three months, there are exactly zero people available to deal with problems in her corner of the monolith. When people leave the company (because of course they under-pay and over-ask) it often results in a plethora of problems because the next person put on that bit of the system breaks a bunch of shit because there is nothing documented. My expectation is that management will, sometime in the next years, realize that they can actually fix the fundamental problems with their organization and get more done. Get everyone to talk through the code they manage, record and transcribe it all, and autogenerate the mother of all documentation and mandate docs updates from there on in the code change process. Now you're no longer dependent on tribal knowledge and expensive senior devs. Then, talk to every client about all of the wishes they have, for as long as the client wants. Record and transcribe it all and generate the mother of all backlogs and requirements. You give every medior or even junior tasks from that list and assign them an agent cluster, only requiring the senior devs for the aforementioned checks and balances. They could even take all 8 hours of a work day just for picking the tasks and for checks and balances, and have the agent cluster work throughout the night to provide new code to check and balance. But I highly doubt all of that will require more devs. Are... you going to explain to the AI fleet why that bug has to stay?
The upcoming wave, which I'm calling "agent clusters" – the chariot I hinted at in the last section – should make landfall by Q3. This wave will enable each of your developers to run many agents at once in parallel, every agent working on a different task: bug fixing, issue refinement, new features, backlog grooming, deployments, documentation, literally anything a developer might do.
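The agent-cluster idea quoted above (one coordinator handing different tasks to specialized agents running in parallel) can be sketched roughly like this. This is only an illustration under loud assumptions: `Agent`, `make_stub_agent` and `orchestrate` are hypothetical names, and each "agent" is a stub function standing in for a model call with its own prompt and toolset. Nothing here comes from the article or from any real agent framework.

```python
from concurrent.futures import ThreadPoolExecutor
from dataclasses import dataclass
from typing import Callable


@dataclass
class Agent:
    """Hypothetical worker: a role name plus a function standing in for an LLM call."""
    role: str
    run: Callable[[str], str]


def make_stub_agent(role: str) -> Agent:
    # Assumption: a real agent would call a model with a role-specific prompt and tools.
    return Agent(role=role, run=lambda task: f"[{role}] done: {task}")


def orchestrate(tasks: dict[str, str], agents: dict[str, Agent]) -> dict[str, str]:
    """Send each task to the agent registered for its role and run them in parallel."""
    with ThreadPoolExecutor(max_workers=len(tasks) or 1) as pool:
        futures = {
            role: pool.submit(agents[role].run, task)
            for role, task in tasks.items()
        }
        # Only the coordinator collects results and reports back to the human.
        return {role: future.result() for role, future in futures.items()}


agents = {role: make_stub_agent(role) for role in ("bugfix", "docs", "backlog")}
report = orchestrate(
    {
        "bugfix": "fix login crash",
        "docs": "document session API",
        "backlog": "groom old tickets",
    },
    agents,
)
```

The design choice worth noticing is that the human only ever sees `report`, the coordinator's collected output, which is the "have only that one talk to the human in the loop" part.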
I'd love for this to be repeated with a piece of text that is not that. A friend who works as an English teacher at a community college pointed out that there might be a bit of cognitive bottlenecking going on - the inability to parse the text declines very quickly once you go above a certain percentage of missing vocabulary. This comment on there also adds some extra nuance: I fully expect the underlying point, that college student literacy has collapsed, is true, but I think the people who designed this test failed to build it in a way that it could possibly prove or disprove their hypotheses. At this level of difficulty, reading a text like this is a test of subject matter knowledge and reasoning ability, not literacy or English language skills. I used to get poor marks in French listening, speaking, and writing, but ace reading comprehension, because I had general knowledge and the ability to reason well with incomplete information. It wasn't reflective of my French skills when I overperformed in one of four French tests. I'm inclined to be very sympathetic to the students here. Those paragraphs may as well be in a different language, they're filled with words no-one uses anymore, or which are being used in ways these kids have never seen before (if you haven't encountered "whiskers" or "wonderful" in these contexts before you won't think to look them up in the dictionary just in case) and it's entirely reasonable for college students not to have learned the words for 19th century phenomena they will never encounter in their own lives (horse blinkers, Michaelmas). Even the fact they had a dictionary isn't the catch-all excuse you want it to be because if they have no incentive to get these questions right they won't be motivated to do twenty minutes of linguistic archaeology to answer these questions.
Who's got two thumbs and decided this year's big project would be to get 16 governments to sign a multi-million euro contract together to establish a regional bikesharing system? This guuyyy. The good parts are that I'm essentially in the lead for deciding what the system will look like, as I've written the requirements specification and partnership agreement. There's a blend of pragmatism, idealism and knowledge of the abstruse ways bikesharing works that I'm proud to have been able to put into it. The downside is that when you run a pack of 16 governments in a particular direction, there's a significant risk of the pace being defined by the slowest of the pack. And the biggest municipality in this ordeal has been plodding along with a breathtaking lack of competence even after escalating it to the highest level possible. I think we'll find a way? But I'm not as sure about that as I'd like to be. I vibecoded a game this week! Inspired by mike's game I wondered if there is a game like that that I'd like to play and make. Being a bit of a public transit ("OV") and of course mapping nerd, I ended up with the idea to take GeoGuessr and apply that best-of-5 game mode to guessing the location of Dutch train stations based on the name alone: OVGuesser. If you're really good at triangulating you might get some points, but I've made it kinda hard on purpose. The combo of Replit and Claude 4 meant I had a prototype up and running in, like, half an hour? Most of the rest of the time was me finicking with the UI and UX to get it to look and feel like I wanted it to. It did break spectacularly when I shared it (first on Linkedin, because who isn't on Linkedin for distractions?). Since I never said I wanted the game to be scalable, the AI decided to give each game session its own sessionID stored locally. Which worked fine when I played it solo but caused every single person to be unable to progress past their first guess to the next station.
Cue me being unable to panic-fix that issue. In the end I realized pretty quickly that scalability must be the issue. Refactoring AI-generated code is...not great at the moment. Claude 4 is very cheery and happy about everything it makes, and for sure not critical enough about its own capabilities. In the end I just got Gemini 2.5 Pro to act as the sceptical senior dev, double-checking everything to make sure the refactor actually solved the problem. That worked much better than I expected. I'm still kinda surprised that the word vibecoding was coined just this year. It is very obviously an imperfect system? But man is it fun and quick and good at side projects like this.
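For what it's worth, the usual shape of a fix for that kind of multi-player breakage is to have the server hand out the session ID and keep one state entry per session. The sketch below is a guess at the general pattern, not the actual OVGuesser code: `GameServer`, `start` and `guess` are hypothetical names, and it assumes the failure came from players' progress not being tracked separately on the server.

```python
import uuid


class GameServer:
    """Hypothetical in-memory server for a best-of-5 guessing game."""

    ROUNDS = 5

    def __init__(self) -> None:
        # One entry per session, keyed by an ID the server generates itself.
        self._guesses_made: dict[str, int] = {}

    def start(self) -> str:
        """Create a new game and return its session ID."""
        session_id = uuid.uuid4().hex
        self._guesses_made[session_id] = 0
        return session_id

    def guess(self, session_id: str) -> int:
        """Record a guess and return how many guesses this session has made."""
        if session_id not in self._guesses_made:
            raise KeyError("unknown session")
        if self._guesses_made[session_id] >= self.ROUNDS:
            raise ValueError("game already finished")
        self._guesses_made[session_id] += 1
        return self._guesses_made[session_id]


server = GameServer()
alice, bob = server.start(), server.start()
server.guess(alice)
server.guess(alice)
bob_round = server.guess(bob)  # Bob's progress is independent of Alice's
```

A real deployment would want this state in a database or cache rather than a dict in memory, since a dict disappears on restart and isn't shared between server processes.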
Not bad for a first try? What’s the median score? LAST WORD May 22, 2025 I scored 9 words! 🟡⚪⚪⚪🟡 🟡🟡⚪⚪⚪ ⚪🟡🟩⚪⚪ 🟡⚪🟩🟩⚪ ⚪🟩🟩🟩⚪ ⚪🟩🟩🟩🟩 ⚪🟩🟩🟩🟩 ⚪🟩🟩🟩🟩 🟩🟩🟩🟩🟩
We finally finished making the wedding photo album with the photographer the other day. I'm still wildly impressed with the photos she has taken, in particular their ability to evoke the emotions of that day even after poring over them again and again for the album. I'm trying to give myself more room to think lately, to meditate on nothing in particular. My daily routine has now included a morning meditation when I wake up for most of the past six months. It is slowly paying some dividends by instilling more calmness throughout the day. As my wife is going through some things (an ADHD diagnosis to name but one thing of many), I'm glad I'm there to support her to the best of my abilities.
Looks like the model has good advice for Jony Ive. Cool project. I am kinda jealous because making that kind of segmented display is an idea I had floating around in my head, until I realized nobody is selling arrays of 'em and electrical engineering of that degree is voodoo magic to me.
You've at least made me realize that I for too long have approached music like incels approach the other gender: why won't it just come to me, I'm over here doing nothing right. I used to be much more into music when I went to college, listening to indie and college radio stations. Actively seeking out new music for my own music collection on Google Play Music. But the radio stations became worse over time and my [favourite (government funded!) music site](https://en.wikipedia.org/wiki/3voor12) had a few bad years drowning in the social media algorithms and shittier writing. Then they deleted their weekly curated playlist and moved to a weekly radio show and they lost me completely. My usual avenues of exploration didn't work for me anymore so I mostly gave up. And I think that's what that author's capital sin is, too. At some point when Google shat the bed on GPM I finally capitulated and moved to Spotify. That transfer was rough enough that I still feel like I'm missing at least a third of the music I used to listen to. Spotify allowed me to listen to more music (without user-ripped mp3), but the Spotify algorithm has been absolute shite at finding new music for me. Discover Weekly for years would serve me either music I already knew ("hey do u liek Fleetwood Mac?") or music that didn't stick. I only found out the other day that Spotify has an internal KPI for the licensing cost per million streams. They heavily favour their algorithms to push you to music that you might like that is also the cheapest for them to stream. I feel like it explains why I dislike their playlists so much, but it's hard to prove. The underlying mistake both that author and I were making is holding on to the idea that there is some common, shared Good New Music that will just come to you without being active. Our feed-wired brains have been gradually eroded to the point we're now thinking "but what do you mean you seek OUT things??!??"
I think it also doesn't help that whatever music media landscape there was is now well and truly dead. You can't just sit back and let the zines decide your taste for you, because there's almost zero chance that your friends will find the same bands in the same way. Curation isn't centralized anymore, and that's great! But it means lazy people have it harder. I just spent a good hour going through a few of 3voor12's recent articles and new music. Already found more new music today than in the past weeks. whoddathought! And you know what? It's fucking easy. Find a podcast you like. Find a Mixcloud DJ you like. Find a Soundcloud DJ you like. Find someone to follow on Spotify. Find someone to follow on Last.FM. ASK YOUR FUCKING FRIENDS.
I’m not trying to be preachy or be an armchair shrink, but it might be good to try and not game for a while if you’re feeling aimless. Games are very good at satiating your desire for community, challenge, accomplishment and autonomy, without actually giving you the real fulfillment that propels you forward in life. It’s like taking artificial sweeteners when you actually need the real deal. You have the motivation in you to do something else, but right now it sounds to me like part of the problem is that your motivation is hijacked by playing so much WoW.