
Comment Re:Mostly useless for normal users (Score 1) 65

The point of the absurdly large model is to distill its logits into smaller models. Overparameterization makes it much easier to learn the underlying function. Once the underlying function is learned, a drastically smaller model can learn the output distribution (teacher-student distillation).
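For anyone who hasn't seen it, here is a minimal sketch of what "distilling the logits" means. It assumes the common setup where the student is trained to match the teacher's temperature-softened output distribution via KL divergence; the function names, the temperature of 2.0, and the example logits are all made up for illustration, not taken from any particular model.

```python
import math

def softmax(logits, temperature=1.0):
    """Convert logits to probabilities, softened by the temperature."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def distillation_loss(teacher_logits, student_logits, temperature=2.0):
    """KL(teacher || student) on temperature-softened distributions."""
    p = softmax(teacher_logits, temperature)
    q = softmax(student_logits, temperature)
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)

# Hypothetical logits over three tokens.
teacher = [4.0, 1.0, 0.5]
good_student = [3.9, 1.1, 0.4]   # close to the teacher
bad_student = [0.5, 1.0, 4.0]    # ranks the tokens the wrong way round

print(distillation_loss(teacher, good_student))
print(distillation_loss(teacher, bad_student))
```

The student that tracks the teacher's distribution gets a much smaller loss, and that loss is what a real training loop would minimize over many examples.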

Comment Re:I think you completely missed the point (Score 1) 98

This has been pretty well studied, and large amounts of homework are counterproductive. It's just not how human beings learn.

The benefits of homework depend on the volume completed (as opposed to assigned), on age (pre-middle-school students get little benefit, middle schoolers benefit from about 1 hour, high schoolers from 1.5-2.5 hours), and on whether the student actually does it (as opposed to copying, etc.).

https://ancillary-proxy.atarimworker.io?url=https%3A%2F%2Fwww.readingrockets.org...

As volume increases, students are more likely to skip it or cheat.

Comment Re:Crazy idea (Score 1) 509

That's an American problem. Funny given that the tech is American. But then the US has always lagged behind with contactless payments.

Most of the world has widely used public transport that accepts contactless payment, so it is a fairly obvious transition in most countries. The US lacks decent public transport for most people, so they have no experience with contactless payment for public transit, hence the slow adoption.

Comment Re:I assume they mean the webapp. (Score 1) 37

Since a substantial percentage of the tokens are from Chinese media and sources, it will have the same biases seen in those sources, just as models trained on US media will have a US bias.

Of course, to the extent that Chinese media publish propaganda, the model will learn that propaganda (similar to US media publishing propaganda). Sometimes US reporting will eventually correct the propaganda, though most media don't bother.

As to deliberate censorship - the model itself does not appear to be censored. Rather, when serving the model they run another model that scans the input and output and terminates the responses that are to be censored.

Comment Re:Oh the glorious missed first post. (Score 1) 118

So I'll add a question. Could they try to defend their IP in courts against downstream consumption by other models, while simultaneously ignoring their hypocrisy? I mean... of course they can. I withdraw my question.

They aren't claiming a copyright violation, they are claiming a contractual violation - that DeepSeek violated the terms of use of their API by allegedly using the API to generate training samples.

People who don't like OpenAI are trying to claim copyright violations - that OpenAI 'stole' copyrighted works. Under US law there is 'transformative fair use' - and machine learning models are pretty clearly transformative. So they aren't really the same thing, since in this case copyright isn't being asserted, just a contractual violation.

Now, if a third party had generated the content with the API and published it, and DeepSeek had then downloaded and trained on that content, neither the third party nor DeepSeek would have committed a copyright violation or a Terms of Use violation.

Comment It is almost certainly from an 'exotic animal' farm (Score 1) 196

Influenza in the West is usually transferred from birds to livestock via poop or saliva (the bird drops food or poops into feed or water), which infects the livestock. The extreme frequency of interactions between ranchers or farmers and the livestock (during feeding, slaughter, etc.) then results in the transfer to humans.

Likewise, a bat likely pooped or dropped food into the feed troughs or water troughs of an 'exotic animal' farm, and the similarly high rate of interaction with farmers/ranchers resulted in the COVID transfer.

It is those extremely high rates of human-animal interaction that provide an opportunity for exposure to enough mutation variants that a transfer to humans can occur.

The number of interactions of a worker with an infected sample in a lab, in such a way that they could inhale it or transfer it hand to mouth, is so extremely low that the 'escape from research facility' theory is rather absurd.

Comment Re:It's not AGI if it needs an expert for handhold (Score 1) 55

Did you miss the point? Did they just redefine AGI in terms of sales, so it's just a matter of how much lying they can do to persuade you they have AGI?

No, you missed the point. They can discover what should be declared AGI and delay determining as such until they have 100 billion in profits, and profits can be prevented by simply investing more money into hardware and acquisitions. So they can delay declaring AGI indefinitely.

The for-profit subsidiary is redefining terms so that the non-profit violates their charter.

Comment misleading intro (Score 5, Insightful) 132

The fact that schools taught in English provided a skilled workforce, and COVID-19 suppression of tourism caused massive job loss, resulting in outmigration to countries with jobs and good wages.

It is the tourism job dependence that caused the loss of population, not the 'happiness index' focus.

Comment Epstein and Gates (Score 1) 176

The entirety of the relationship was Epstein discovering Gates had an affair with a mutual acquaintance and Epstein trying to blackmail Gates into being a founding member and donor of an Epstein 'charity'.

So yes it was bad that Gates had an affair, but no there was nothing sordid other than the affair.

Comment Tax treatment of software development R&E (Score 1) 64

This is likely in part a result of changes in the tax treatment of research and experimental (R&E) expenditures for software development:

While the law will cause headaches for many industries, the software industry is particularly impacted. The TCJA added IRC Sec. 174(c)(3), which explicitly states that any expenses incurred in connection with software development must be treated as an R&E expenditure (capital asset) and therefore amortized.

Previously, software development expenses were not explicitly under the purview of IRC Sec. 174. Instead, Rev. Proc. 2000-50 provided that software development expenditures "in many respects so closely resemble the kind of R&E expenditures" that fall under IRC Sec. 174 that it warranted "similar accounting treatment." Thus, a taxpayer could elect to deduct 100% of these expenses under IRC Sec. 174, amortize the expenses over a five-year period, or amortize the expenses over a three-year period under IRC Sec. 167(f). However, software development expenses did not fall under the definition of "research and development expenditures" under IRC Sec. 174. The TCJA changed this by formally defining software development expenditures as an R&E expenditure subject to IRC Sec. 174.

The impact of this change on software and technology companies cannot be overstated. Many companies have had their taxable income increase dramatically because they can no longer deduct expenses. In fact, some companies have gone from being unprofitable to profitable and liable for federal and state taxes due to the change.

https://ancillary-proxy.atarimworker.io?url=https%3A%2F%2Fwww.eisneramper.com%2Fin...
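To make the "unprofitable to profitable" point concrete, here is a rough sketch of the first-year arithmetic. The quoted passage doesn't give the amortization schedule, so this assumes the five-year period with a mid-year start that applies to domestic research under the amended Sec. 174; the company and all of its numbers are hypothetical.

```python
def first_year_taxable_income(revenue, dev_costs, other_costs,
                              amortize=True, years=5):
    """Taxable income in the first year under either treatment.

    amortize=False: deduct 100% of development costs immediately.
    amortize=True: spread them over `years`, with only a half-year of
    amortization in year one (mid-year convention, an assumption here).
    """
    if amortize:
        deduction = dev_costs / years / 2
    else:
        deduction = dev_costs
    return revenue - other_costs - deduction

# Hypothetical company: the numbers are made up for illustration.
revenue, dev_costs, other_costs = 1_000_000, 900_000, 200_000

before = first_year_taxable_income(revenue, dev_costs, other_costs, amortize=False)
after = first_year_taxable_income(revenue, dev_costs, other_costs, amortize=True)

print(before)  # -100000: a loss, no income tax owed
print(after)   # 710000.0: taxable profit on the same cash flows
```

Same revenue, same salaries paid, but only 10% of the developer cost is deductible in year one, so a company that lost money in cash terms shows a taxable profit.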

Comment Did they prove it was training data? (Score 2) 73

It doesn't look like it emitted 'training data' - but rather random strings that match personally identifiable information.

If you have ever tried to create a new email address, you will have frequently found your chosen username already taken. Same thing - an LLM trained to generate 'email like' strings will naturally generate random strings that collide with real addresses, even if the address didn't exist in the training data.
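A toy simulation shows how easily this happens. This is only a sketch: the name lists, the first.lastNN pattern, and the population sizes are invented, and the "generator" is just a random sampler that knows the pattern and never sees the set of real addresses.

```python
import random

# Hypothetical pools of common name parts.
FIRST = ["john", "mary", "alex", "sara", "mike", "lisa", "dave", "anna"]
LAST = ["smith", "jones", "brown", "lee", "khan", "garcia", "kim", "patel"]

def email_like(rng):
    """Build a plausible address from common name parts plus a number."""
    return f"{rng.choice(FIRST)}.{rng.choice(LAST)}{rng.randrange(100)}@example.com"

# Addresses that "exist in the world"; the generator never sees this set.
world_rng = random.Random(0)
real_addresses = {email_like(world_rng) for _ in range(3000)}

# An independent generator that only knows what an address looks like.
gen_rng = random.Random(1)
generated = [email_like(gen_rng) for _ in range(1000)]

hits = [g for g in generated if g in real_addresses]
print(f"{len(hits)} of {len(generated)} generated strings match a real address")
```

Every hit is a "real" address that was never in the generator's input. A match with real PII doesn't by itself prove the string was memorized from training data.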
