How to check divisibility by 65521 without a remainder operation:
bool div65521(uint32_t n) {
const uint64_t c = 1 + ~0ULL / 65521;
return n * c < c;
}How to check divisibility by 65521 without a remainder operation:
bool div65521(uint32_t n) {
const uint64_t c = 1 + ~0ULL / 65521;
return n * c < c;
}The belief that men not moved by a sense of duty will be unkind or unjust to others is but an indirect confession that those who hold that belief are greatly interested in having others live for them rather than for themselves.
abysmal /əˈbɪzml/
Merry Christmas!
How I see APL is as an inherent and obvious product of programming language evolution.
Once you start with Assembly and get good enough, you notice that you no longer think in terms of opcodes, GOTOs or registers; instead, you think of variables and high-level control structures (if this then that). translating mental pseudocode to assembly becomes tedious.
Then, we saw the advent of higher level languages like C: you no longer have to spend so much intellectual effort on translating ideas into code, as the compiler can do it better. Further, the gist of the idea can be understood better by most programmers by seeing more structured code than assembly.
After that, we realised that writing a hash map from scratch in every source file is also an inefficient use of programmer time, as humans find it easy to conceptualise basic data structures like a dictionary. Unfortunately, this is where most programmers nowadays plateau: when writing code, they think of for loops, hashmaps and if statements. The average programmer deludes themselves that they are not bottlenecked by their typing speed or their skill at translating abstract ideas into code: they see programming as the act of such translation without regard to the art of reasoning in abstraction.
The CS-savvy programmers discover functional programming. They learn about recursion, which greatly simplifies many structural operations. But at some point they notice that recursion as a concept is powerful and low level: difficult to debug, difficult to reason about, difficult to match patterns on, verbose. At this point, they discover recursion schemes and become efficient at juggling around maps, filters, scans and reduces. They make it easy to do basic strength reduction and optimisations, change existing behaviours and add components to data processing pipelines, on top of being rather easy to reason about to a skilled programmer. But many people don't ever get to this stage - see the amazement at "ngn scans". And then the ultimate step of this evolution is noticing that most of these maps that hide recursion deep down are not necessary and if only data was arranged in arrays, the loops could be tucked away inside of even the most primitive operations. This is what makes APL relevant and so representative of human thought - it minimises the amount of intellectual effort that is required to turn abstract ideas into code. It lets programmers focus not on the bread and butter of computer science but further develop ideas and improve as scientists and problem solvers.
But - as I mentioned before, most people stop at the third stage: they can never efficiently reason about abstract ideas or very high level code, because they're not representative of their mental model which was shoehorned into ALGOL-style programming. They never solve problems abstractly - they think in terms of code, which has high degree of cognitive overload due to the need of mentally materialising all the ceremony and magic that's related to inherently verbose mainstream languages. And this is why they never feel bottlenecked by the speed at which they write stuff. Hence, APL simply doesn't work in a commercial, group setting. Readability in the common sense is a dial that you can twist between programmer convenience and efficiency and how digestible the code is to the average person. Unfortunately, you will be hiring average people and you will not meet two efficient programmers with the same mental model, so the idea of abstracting away reality doesn't work. You have to agree on the lowest common denominator in terms of abstraction levels that makes working comfortable for everyone: it's easier to adopt a (simplified) shared common mental ground than get everyone to agree on the local zen of code.
But maybe forcing people to agree on the "zen of code" would be a good thing: so-called readable code is meant to simplify onboarding and development of code because programmers don't need to spend a long time figuring out how the whole thing works. That may be a bad thing: programmers may delude themselves into thinking that appearing to work is the same as working, and hence introduce subtle bugs into the code base that they don't really understand, which the upfront familiarity would have de facto required. This is a common trend in programming right now. I don't agree with it, because every code base builds up their own non-standard set of primitives in utility classes that take a long time to grasp, while APL regardless of where you use it is mostly the same and idiomatic due to the fact that primitives actually do things.
Bottom line: Is becoming more efficient at programming and abstract reasoning through the use of better tooling a ever goal for programmers? There is no tangible benefit from being specifically faster at greenfield programming. I find it desirable because I am a young programmer and first and foremost a scientist, but I would imagine that my more senior colleagues don't have a reason to chase this - they work on established code bases that grew hairs and limbs over the years, just like every commercial/long-term project and don't ever have the drive to return to greenfield programming.
Rediscovery: a reasonably good RNG in under 300B of brainfuck code:
>>>+++[<+++++>-]>+[<[-]<[>+<<<+>>-]<<[>>+<<-]>>>[<<<+>>>-]<<<[>>>++<<<-]>>>[<+>-]<[>+<<<+>>-]<<[>>+<<-]>>>>>+++++>+<[<<<<<+>>>[>>>>>+<<<<<-]>>>>>[-[-<<<<<+>>>>>>]<]<[>]<<-]<<<<<[>>>>>+<<<<<-]>>>>>>-<[-]<<[<->-]<<[>>+<<<+>-]<[>+<-]>+>>[>>>>>+<<<<<-]>>>+>>[-[-<<<<<+>>>>>>]<]<[>]<-<<<[<+>-]<.>>]We should combine Malbolge, Haskell, CUDA, MIPS assembly, Lean and Verilog to create a new fantastic programming language.
Progress on Data Compression in Depth:
106 118 2777 ./main.tex
2632 27535 184506 ./content/chapter3.tex
3635 37047 273143 ./content/chapter5.tex
33 321 2255 ./content/appendix.tex
14 796 5391 ./content/foreword.tex
484 5366 35317 ./content/chapter7.tex
1316 11660 77651 ./content/chapter0.tex
1488 11531 76060 ./content/chapter4.tex
1460 11414 83695 ./content/chapter1.tex
3021 27776 186147 ./content/chapter2.tex
2112 15631 113221 ./content/chapter6.tex
247 3365 22936 ./auxiliary/scratchpad.tex
16548 152560 1063099 totalPicture an instant messenger (e.g. Discord). Open a random channel in a STEM-related server.
“A: Hey, I need help with problem [X]. Please DM me.” “A: Hey, I need help with problem [X]. B: Sent a DM.”
Sounds familiar?
It has always puzzled me why would people rather offer help via direct messages instead of discussing the subject in public. After all, this way the answer can benefit everyone reading who may be asking themselves the same question. Also, there may be some people who will ask follow-up question, and yet another person to answer them. Perfectly logical.
Now, step back for a second. Depending on how long you have been on the internet, this may annoy you even more than some of the freshly-recruited STEM hobbyists. Since a lot of programming forums are dead, spammed with useless content or poorly moderated, it’s becoming more difficult to just look the the question online. Have you ever found yourself joining a community and immediately after searching for your problem in the message logs? When you really think about it, is it actually true that there will be many people who will benefit from the answer? After all, we can’t efficiently index Discord servers too. And what is the likelihood that two people with the exact same problem will be browsing the channel at the same time?
Someone demonstrates a simple C program along these lines:
#include <stdio.h>
int main(void) {
char a = 0x12, b = 0xAB;
printf("%04X %04X", a, b);
}Contrary to what they expected, this program prints 0012 FFFFFFAB. Why does this happen? Pretty simple - so you start explaining: printf wants to print a variable of type unsigned int, but the variables you have declared are signed chars. So the signed char has to be widened to an int, but in 2s complement 0xAB is a negative number, hence to preserve the sign it’s padded with 1 bits at the front (i.e. sign extension; compared to padding with zero bits - zero extension). You suggest that marking the variable as unsigned char will solve the issue.
Before your suggested solution could even be tested, someone interjects. After all, the C standard does not guarantee the signed number representation. Further, the standard doesn’t specify whether a char is signed or unsigned, so it’s incorrect to assume that it’s signed. You are wrong.
Annoyed yet? And you have been unlucky too, because the pedantic interjection is actually technically correct. These kinds of exchanges stem from the (exceedingly common!) pedantry in educational settings coming from complete misunderstanding of didactics. Practically speaking, simple problems that have difficult explanations usually need some degree of simplification to be digestible by a beginner. Unfortunately, such simplification, for a sufficiently simple subject, leads to you technically saying things that are not entirely correct. This makes “safe” explanations really fucking difficult, if not impossible.
Finally, the flip side of this is the fact that pedantry and inability to temporarily accept axioms never actually leads to learning. Learning is a deeply unscientific process, driven by empirical experimentation, simplification and iteratively patching the voluntarily created gaps in one’s knowledge. In other words, you probably want to learn like BFS with pruning, not like DFS.
Takeaway: People would rather explain simple problems in private to avoid having to deal with idiots butting in.
Really happy with how my book is coming together. I spent a lot of time on it. Maybe publishing it is not as remote as I thought it will be. Also happy pride month.
Every mathematician believes that they are ahead of others. The reason they never state this in belief in public is because they are intelligent people. On the contrary, many programmers do state such a belief in public, because the median programmer is not intelligent.
Typesetting my book got me crying a few times.-
I must be the only person on the planet who eats olives straight out of a jar as a snack. I love being an independent adult who lives alone.
German business model was based on cheap energy from russia, cheap subcontractors in eastern eu and steadily growing exports to china. all three are gone by now, but german politicians are still stuck in a world that doesn’t exist anymore. So now after the whole country has been turned into a smelly coal-burning pit thanks to fake reports about nuclear and understating the coal plant emissions by 200x, there's no going back and germany is sooner or later going to level with eastern european countries.
wow, Windows ACLs are real nasty to work with from within C and WinAPI.
The mistakes i made when developing bzip3 are as follows:
If you are working on a project, do your best to never burn a bridge. Try to not take limiting design decisions unless you are absolutely sure of them. Document future potential for improvement, but focus on getting a MVP first.
I came back home! A few highlights of my trip to wroclaw:
Controversial stance: I really hate grug developers. Especially the species that will never learn how anything computing actually works. Like a hashmap/hashset, vtables, literally anything. They will discourage others from learning it because it's useless, and will not learn it themselves because if you roll your own hashtable then it will be buggy.
abysmal /əˈbɪzml/
- extremely bad; appalling. similar: very bad, dreadful, awful, terrible.
- literary: very deep. similar: profound, complete, utter, thorough.
My life feels like a kaleidoscope. Every time it twists, it becomes completely different from how it used to be.
One thing i feel bad about: none of my current projects that are ongoing are public so far. I feel like the stuff i don't put on my github gets excessively little publicity. Actually publishing stuff gives me a sense of fulfillment in that i actually made something and it's gotten to the state in which it's usable, a state which is usually not shared with the stuff I deliberately choose to not publish. Also working on my book is a neverending process. I poured so much time into that.
"Don't take criticism from people whose advice you would not follow" is something i should start living by. It's difficult to tune out the noise and not be let down by it, and I should actually crack on the things that need to be done.
Planning to spend my evening tomorrow with my friends having a board game night together. It kinda sucks that so little people even want to play board games. Almost everyone i know would rather play something on their pc. My jet lag from having been to canada for a month has still not healed. Hopefully it gets better because I have some errands to run.
Benjamin installed more RAM in my computer, now I have 40GB. Relatedly, the amount of Turing machine approximations available on the market is rather disappointing.
Favourite song of today: Toshiki Kadomatsu - Fly by day
Today we ate dinner in the CN Tower restaurant. It was really good. The place where you eat spins around so you can see the whole city. Unfortunately, during our booking it was raining, so you couldn't see much.
Two-day long city break starts today, we plan to catch the train at 12:00 and stay there for ~2 days. Then on the 9th of April I have to go back home in Germany :(.
We made pizza for dinner. It was all gone in ten minutes.
Hexdump of a 526-byte Mersenne Twister program (ELF64, takes decimal seed as input, writes output as a sequence of 4-byte LE fields). Approx. 1.5GiB/s on my machine. I wonder if I can do better.
00000000: 7f45 4c46 0201 0100 0000 0000 0000 0000 .ELF............
00000010: 0200 3e00 0100 0000 7800 4000 0000 0000 ..>.....x.@.....
00000020: 4000 0000 0000 0000 0000 0000 0000 0000 @...............
00000030: 0000 0000 4000 3800 0100 4000 0000 0000 ....@.8...@.....
00000040: 0100 0000 0700 0000 0000 0000 0000 0000 ................
00000050: 0000 4000 0000 0000 0000 4000 0000 0000 ..@.......@.....
00000060: 0e02 0000 0000 0000 ce2b 0000 0000 0000 .........+......
00000070: 0010 0000 0000 0000 488d 358f 0100 0048 ........H.5....H
00000080: c7c2 1000 0000 0f05 31c0 4489 c10f b60c ........1.D.....
00000090: 318d 51c6 83fa f672 0d6b c00a 41ff c001 1.Q....r.k..A...
000000a0: c883 e830 ebe4 8905 6221 0000 4831 c948 ...0....b!..H1.H
000000b0: ffc1 4c8d 0555 2100 0048 81f9 7002 0000 ..L..U!..H..p...
000000c0: 740f 69c0 b517 0000 4189 0488 48ff c1eb t.i.....A...H...
000000d0: e866 bb70 0231 c041 b900 0000 804c 8d15 .f.p.1.A.....L..
000000e0: 2201 0000 4831 ff48 ffc7 6681 fb70 027c "...H1.H..f..p.|
000000f0: 3f31 c948 81f9 e300 0000 743e 418b 1488 ?1.H......t>A...
00000100: 4421 ca45 8b5c 8804 4489 db81 e3ff ffff D!.E.\..D.......
00000110: 7f09 d3d1 eb41 83e3 0143 8b14 9a41 3394 .....A...C...A3.
00000120: 8834 0600 0031 da41 8914 8848 ffc1 ebc3 .4...1.A...H....
00000130: 480f bfcb 418b 0c88 eb7f 31c9 4881 f98c H...A.....1.H...
00000140: 0100 0074 3b41 8b94 888c 0300 0044 21ca ...t;A.......D!.
00000150: 458b 9c88 9003 0000 4489 db81 e3ff ffff E.......D.......
00000160: 7f09 d341 83e3 0143 8b14 9a41 3314 88d1 ...A...C...A3...
00000170: eb31 da41 8994 888c 0300 0048 ffc1 ebbc .1.A.......H....
00000180: 8b15 442a 0000 4421 ca8b 0d7f 2000 0041 ..D*..D!.... ..A
00000190: 89cb 4181 e3ff ffff 7f41 09d3 89ca 83e2 ..A......A......
000001a0: 0141 8b14 9233 1593 2600 0041 d1eb 4431 .A...3..&..A..D1
000001b0: da89 1513 2a00 0031 dbff c389 cac1 ea0b ....*..1........
000001c0: 31ca 89d1 c1e1 0781 e180 562c 9d31 d189 1.........V,.1..
000001d0: cac1 e20f 81e2 0000 c6ef 31ca 89d1 c1e9 ..........1.....
000001e0: 1231 d189 c289 0c96 ffc0 3d00 0800 000f .1........=.....
000001f0: 85f5 feff ffba 0020 0000 4889 f80f 0531 ....... ..H....1
00000200: c0e9 e4fe ffff 0000 0000 dfb0 0899 ..............Applying any sort of entropy coding to lzw output seems like a lot of pain. Nominally you have a lot of invariants like e.g. US-ASCII packets that you could efficiently use FSE for to get a double digit CR, but the match packets (>256 + 2 for table clear and promotion) are so just random. the "simple" scheme of inserting trash into the LZW dictionary on almost every byte is moot and you end up with some sort of weird 16-bit alphabet to compress with sparse entries which occur once or twice per block. and then there's the invariant that all the values are <= than currently seen maximum since last table clear, which UNIX "compress" seems to use for some sort of primitive entropy coding where you begin with 9-bit codes, promote all the way up to 16-bit codes and then freeze the dict and scrape it up when CR drops below delta of 1-3%. for purely technical reasons my program seems faster/stronger than unix compress but it's a low bar. there's nothing you can really do with this format, i can see why people ditched it in the 90s and never came back. ... but now i'm reading up on some interesting schemes like LZW with 12-bit dictionaries that are selected by o0...o3 context and it clicked in my head that LZP/ROLZ are basically LZW, also looking into some other variations on LZW. probably won't figure out a nice way to do EC on the LZW output though.
weird: fwrite(&x,1,2,out); significantly slower than fputc(x>>8,out) and then fputc(x&0xff,out); - glibc 2.37-15.1. some afterthought: maybe not so weird after all.
Hoping to get the very first draft of my whole book printed today. With all the changes to reducing the font and diagram size (to 9pt) it's 110 pages now.
USB-C charging ports = planned obsolescence.
Copilot-generated spaghetti left unchecked does almost as much damage to a codebase as an experienced Haskell programmer.
The reason why we patch most buffer overflow vulnerabilities is not because they're a potential RCE. You can't reliably exploit most of these bugs to get a RCE. The real reason why they're fixed is that they provide surface for a DoS attack. There's negligible difference between a heap buffer overflow leading into a segfault and a panic!("Out of bounds.").
Bzip3 running on my boyfriend's PC-98 (486, 66MHz). It seems to work when the block size is set to 1MB and compresses at 22KB/s. The performance can likely be improved.

Feelings on GregTech New Horizons so far (got only to around midway LV, may not be representative of the modpack as a whole):
