
Comment Re:Better Results (Score 4, Informative) 228

I know this is an alien concept to most people here, but it would be nice if people would actually, you know, read the papers first? I know nobody does this, but could people at least try?

First off, this isn't peer reviewed. So it's not "actual, careful research", it's "not yet analyzed to determine whether it's decent research".

Secondly, despite what they call it, they're not dealing with LLMs at all. They're dealing with Transformers, but nothing about this has anything to do with "language", unless you think language is repeated mathematical transforms on random letters.

It also has nothing to do with "large". The model most of the paper is based on is minuscule, with 4 layers, 32 hidden dimensions, and 4 attention heads. A typical large frontier LLM has maybe 128 layers, >10k hidden dimensions, and on the order of 100 attention heads.
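To put that gap in numbers: a standard rough estimate for decoder-only Transformers is ~12 * layers * d_model^2 weights. This is my own back-of-envelope, with the frontier dimensions above plugged in as assumptions:

    def approx_params(layers, d_model):
        # Rough decoder-only Transformer size: ~12 * layers * d_model^2
        # (4*d^2 for the attention projections + 8*d^2 for the MLP, per layer)
        return 12 * layers * d_model ** 2

    print(approx_params(4, 32))        # ~49k weights: the paper's toy model
    print(approx_params(128, 12288))   # ~2.3e11 weights: frontier-scale dims

That's a factor of several million in size.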

So right off the bat, this has nothing to do with "large language models". It is a test on a toy version of the underlying tech.

Let us continue: "During the inference time, we set the temperature to 1e-5." This is a bizarrely low temperature for an LLM. Might as well set it to zero. I wonder if they have a justification for this? I don't see one in the paper. Temperatures this low tend to show no creativity and get stuck in loops, at least with "normal" LLMs.
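For reference: temperature divides the logits before the softmax, so at 1e-5 even a hair's-width logit gap becomes decisive and sampling collapses to pure argmax. A quick illustration (my own sketch, not the paper's setup):

    import numpy as np

    def next_token_probs(logits, temperature):
        # Standard temperature-scaled softmax over next-token logits
        z = np.array(logits, dtype=float) / temperature
        z -= z.max()                       # for numerical stability
        p = np.exp(z)
        return p / p.sum()

    logits = [2.0, 1.9, 0.5]
    print(next_token_probs(logits, 1.0))   # a real spread across tokens
    print(next_token_probs(logits, 1e-5))  # ~[1, 0, 0]: greedy decoding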

They train it with 456,976 samples (which happens to be 26^4), which is... not a lot. Memorization is learned quickly in LLMs, while generalization is learned very slowly (see e.g. the papers on "grokking").

Now here's what they're actually doing. They have two types of symbol transformations: rotations (for example, ROT("APPLE", 1) = "BQQMF") and cyclic shifts (for example, CYC("APPLE", 1) = "EAPPL").
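As a concrete reference, here's a minimal reconstruction of the two transforms from the examples above (my own sketch, not the paper's code):

    def rot(s, k):
        # Rotate each letter k places through the alphabet (Caesar-style)
        return "".join(chr((ord(c) - 65 + k) % 26 + 65) for c in s)

    def cyc(s, k):
        # Cyclically shift the string right by k positions
        k %= len(s)
        return s[-k:] + s[:-k] if k else s

    assert rot("APPLE", 1) == "BQQMF"
    assert cyc("APPLE", 1) == "EAPPL"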

For the in-domain tests, they train on, say, ROT and test with ROT. The model scores 100% on these. It scores near zero on the others:

Composition (CMP): They train on a mix of two-step tasks: ROT followed by ROT; ROT followed by CYC; and CYC followed by ROT. They then test with CYC followed by CYC. The belief is that the model should have figured out on its own what CYC does, and thus be able to apply CYC twice.

Partial Out-of-Distribution (POOD): They train simply on ROT followed by ROT. They then task it to perform ROT followed by CYC. To repeat: it was never trained to do CYC.

Out-of-Distribution (OOD): They train simply on ROT followed by ROT. They then task it to do CYC followed by CYC. Once again, it was never trained to do CYC.
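For concreteness, here's roughly how I read the CMP split (my own reconstruction, not the paper's code; rot/cyc redefined inline):

    import random, string

    def rot(s, k): return "".join(chr((ord(c) - 65 + k) % 26 + 65) for c in s)
    def cyc(s, k): k %= len(s); return s[-k:] + s[:-k]

    def example(f, g):
        # One two-step sample: a random word and the result of f then g
        word = "".join(random.choices(string.ascii_uppercase, k=4))
        a, b = random.randint(1, 3), random.randint(1, 3)
        return (word, f.__name__, a, g.__name__, b), g(f(word, a), b)

    train_pairs = [(rot, rot), (rot, cyc), (cyc, rot)]   # CYC-CYC held out
    train = [example(f, g) for f, g in train_pairs for _ in range(5)]
    test = [example(cyc, cyc) for _ in range(5)]         # the CMP evaluation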

The latter two seem like grossly unfair tests. Basically, they want this tiny toy model with a "brain" smaller than a dust mite's to zero-shot a task it's had no training on, just from seeing one example in its prompt. That's just not going to happen, and it's stupid to expect it to.

Re: their CMP example: the easiest way for the (minuscule) model to learn it isn't to deduce what ROT and CYC mean individually; it's to learn what ROT-ROT does, what ROT-CYC does, and what CYC-ROT does. It doesn't have the "brainpower", nor was it trained, to "mull over" these problems (nor does it have any preexisting knowledge of what a "rotation" or a "cycle" is); it's just learning: "problem type 1 takes two parameters, and I apply an offset based on their sum; problem type 2..."; etc.
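That shortcut genuinely exists for ROT-ROT: composing two rotations is itself just one rotation with the offsets summed, so the model never needs a standalone concept of ROT. Using the sketch from above:

    def rot(s, k):
        return "".join(chr((ord(c) - 65 + k) % 26 + 65) for c in s)

    # Two rotations collapse into a single offset-summed rotation
    assert rot(rot("APPLE", 2), 3) == rot("APPLE", 2 + 3)   # "FUUQJ" either way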

The paper draws far too strong conclusions from its results. They make zero attempt to insert probes into their model to see what it's actually doing (à la Anthropic). And it's a Karen's Rule violation (making strong assertions about model performance vs. humans without actually running any human controls).

The ability to zero-shot is not some innate behavior; it is a learned behavior. Actual LLMs can readily zero-shot these problems. And by contrast, a human baby who has never been exposed to concepts like cyclic or rotational transformation of symbols could not. One of the classic hard problems is how to communicate with an alien intellect - if we got a message from aliens, how could we understand it? If we wanted to send one to them, how could we get them to understand it? Zero-shotting communicative intent requires a common frame of reference to build off of.

Comment Re:Breaking news (Score 1) 220

100% this. I'm a vegetarian, the sort of person they think should be buying their products, but their products disgust me, because they remind me of meat. I don't want to be reminded of an animal corpse while I'm trying to enjoy a tasty meal. Why mimic the thing I don't want to eat?

(I'll only speak re: vegetarians below, but I expect vegans are similar)

I would ask non-vegetarians: imagine that you live in a world where people ate toddler meat. Real human toddlers, slaughtered for their meat. The vast majority of people in your situation would of course avoid eating it. Some might be radical anti-toddler-meat campaigners. Others might silently accept that they're not going to change the rest of the world. Either way, you let people know that you don't eat toddler meat so they don't serve it to you. But hey, some friendly cooks feel bad that you don't get to enjoy toddler meat! So they make a toddler meat substitute that looks and tastes exactly like toddler meat! They package it in packages with pictures of dead toddlers on them, but with labels reading "No toddler included!" And then they expectantly wait for you to thank them and praise them for finally making toddler meat that you can eat - rather than being disgusted by the whole concept and wanting some non-toddler-related food that doesn't make you think about dead toddlers while you eat.

That's not the situation *all* vegetarians are in, but it is the situation that a *lot*, dare I say most, vegetarians are in.

I think a lot of non-vegetarians cooking for vegetarians are just frankly confused about what to offer us, as they have trouble picturing a meal without meat. It's really simple: you know how fatty, salty, carby, umami-rich stuff tastes really really good and leaves you feeling satiated? Yeah, just make something that's fatty, salty, carby, and umami-rich that doesn't involve meat, and your vegetarian friends will be happy ;) Like, pick anything carby, cook it with a tasty fat that pairs well with it, add salt and something umami-rich (mushrooms, nuts, tomatoes, yeast, nori, olives, cheese (vegetarians only), spices, etc etc), and voilà, that's a good vegetarian dish ;)

Of course, *to make it healthier* and give it a more "adult" taste, you'll want to include non-carby veggies (which, per unit *dry mass*, are actually highly protein-rich - freeze-dried broccoli is well more protein-rich than your average ground beef minus its water, and freeze-dried watercress is up there with fish; they're just heavily watered down). Veggies also add umami. You can also - optionally, it's not at all a requirement - include high-protein things like tofu, tempeh, seitan, TVP, etc. But protein deficiency is not common among vegetarians or vegans in western society. The main risk is iron deficiency, particularly for vegans, and - exclusively for vegans - B12 deficiency, though only if they don't eat anything fortified with B12, and B12 fortification is common.

Comment Re:80 to 100 years (Score 1) 58

Am I the only person getting whiplash that we're rediscussing the exact same thing? This concept was already proposed as Breakthrough Starshot, and it was big in the press at the time, incl. on this site.

Anyway, you still have to have the energy to transmit back, which was proposed to come from an RTG. My hot take: since you already have to have a (comparably) big sail anyway, which means booms to maintain its structure, use 232U to get many times the energy per gram of 238Pu (38.9 MeV per decay chain vs. 5.6 MeV per decay), at the cost of a hard gamma from 208Tl, and put it out on the ends of the booms, with appropriately radiation-hardened electronics. You could also do double duty with an alpha sail (an alpha emitter backed by a thin low-Z material to create net thrust from alpha emission).
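Back-of-envelope on "many times the energy per gram", using the two figures just quoted (total chain energy for 232U vs. a single alpha for 238Pu; half-lives and power density ignored):

    AVOGADRO = 6.022e23   # atoms per mole

    def mev_per_gram(mev_per_atom, molar_mass):
        # Total decay energy per gram, if every atom eventually decays
        return mev_per_atom * AVOGADRO / molar_mass

    u232 = mev_per_gram(38.9, 232)   # full 232U chain
    pu238 = mev_per_gram(5.6, 238)   # single 238Pu alpha
    print(f"232U / 238Pu: {u232 / pu238:.1f}x")   # ~7.1x per gram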

Also, if the 232U is in the form of a compound that's sufficiently soft for fast diffusion at its equilibrium temperature, you can diffuse out the 220Rn and avoid 208Tl's hard gamma altogether. This costs you (you only capture 16.6 MeV of alphas), but not only does it avoid the hard gamma, it also means that you don't retain the mass of the stable decay products, so your craft gets significantly lighter over time. (At one point I was considering urania aerogels to lose the radon instead of "soft" high-diffusion materials, but the data suggested that the aerogel would quite quickly self-pulverize and densify.)
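Quantifying that trade-off with the figures above plus plain mass numbers (my arithmetic, not a sourced design study):

    E_FULL, E_PRE_RN = 38.9, 16.6   # MeV: full chain vs. alphas before 220Rn escapes

    print(f"chain energy kept when venting radon: {E_PRE_RN / E_FULL:.0%}")   # ~43%
    # Mass bookkeeping per decayed atom: a vented 220Rn carries off 220 of the
    # original 232 mass units (~95%); letting the chain finish on board instead
    # leaves a stable 208Pb, keeping ~90% of the atom's mass on the craft.
    print(f"mass shed per decayed atom when venting: {220 / 232:.0%}")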

232U is readily producible (indeed, it's a waste product in thorium reactors); the main issue is just that it's a pain to handle. But for something like this, you're dealing with microscopic amounts.

Comment Hour of mediocre (Score 4, Insightful) 34

Coding arguably requires talent - although Microsoft has been proving consistently for half a century that you can be a successful software company with piss-poor engineers.

But if AI produced perfect code, then producing software would essentially require no talent. I'm not saying that's a bad thing in itself, but it moves the act of producing software squarely into the realm of the everyday and mediocre, accessible to talentless people.

And on top of that, the premise is a fallacy: AI simply doesn't produce anything that even resembles perfect code. But of course, Microsoft is desperate to have you believe otherwise...

I'll just say this: I'm glad I'm at the end of my career as a software engineer, because I didn't spend a lifetime honing my skills to end up a mediocre types-question-guy.

Comment Re:The only thing smart-anything things do is (Score 3, Interesting) 58

The stress level measurement is what the smartwatch pretends to supply you with - a feature that entices you to purchase the watch if you're interested in knowing your stress level.

What's being monetized is the raw data - accelerometer measurements, O2, location... whatever the hell those things measure to do what they pretend to do - because a lot of really invasive and personal information can be inferred from those measurements.

Comment Re:There should be an easy natural observation (Score 4, Interesting) 70

The least-harm principle. There's essentially universal agreement that low (dietary-range) levels of lithium are not harmful, while the research as a whole is strongly suggestive of a benefit (though it has not yet met the standard of, for example, an EPA regulatory limit for lithium in drinking water). Lithium, at the doses necessary, costs basically nothing, takes seconds to take, and is orders of magnitude away from the levels where potential toxicity symptoms can arise. To me, that's an easy call. Also, Alzheimer's runs in my family, so there's an extra factor weighing on the scale.

Comment Re: Didn't we know this a decade ago? (Score 1) 70

Nothing weird about sodium fluoride, fluorosilicic acid, or sodium fluorosilicate. Sodium fluoride is a simple salt that dissociates immediately upon dissolution to Na+ and F-. Fluorosilicic acid and sodium fluorosilicate yield the hexafluorosilicate ion (SiF6^2-), which rapidly hydrolyzes: SiF6^2- + 4H2O -> Si(OH)4 + 6F- + 4H+. Si(OH)4 (orthosilicic acid) is the form of soluble silicon that plants and diatoms consume, and it is perfectly normal in water in the tiny amounts from fluoridation (around 6 micromolar). Ocean surface water near Antarctica, for example, runs up to ~80 micromolar. And it goes without saying that minuscule amounts of sodium in water are also perfectly normal. The addition of the fluoride ion is the only actually meaningful impact.
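A quick sanity check on that ~6 micromolar figure, assuming the common 0.7 mg/L fluoridation target (an assumption on my part; scale accordingly for other targets):

    F_MOLAR_MASS = 19.0          # g/mol, fluorine
    fluoride_g_per_l = 0.7e-3    # assumed 0.7 mg/L fluoridation target

    f_molar = fluoride_g_per_l / F_MOLAR_MASS   # mol/L of F-
    sif6_molar = f_molar / 6                    # each SiF6^2- delivers six F-
    print(f"F-: {f_molar * 1e6:.0f} uM; SiF6^2- equivalent: {sif6_molar * 1e6:.1f} uM")
    # -> F-: 37 uM; SiF6^2- equivalent: 6.1 uM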
