
Comment Crazy, damaged thinking is worse than deception (Score 2) 47

Actually, 'hallucinations' sounds worse because it implies a disconnect from reality, which is quite different from simply 'making sh*t up' or lying. LLMs can intentionally deceive, but hallucinations are an unfortunate side effect of their design. While creating deceitful responses would require a level of intelligence and intent, hallucinations stem from flawed processing and can reflect more erratic or damaged thinking. So, when you put it that way, 'hallucinations' doesn't really sound better at all. Calling it making sh*t up or lying gives them more credit than they deserve.

Comment No. (Score 4, Insightful) 128

Unsympathetic start with the sob story of a Zach Yadegari who could not find two more points on the ACT to get into an Ivy League college. Nor could he afford to spend part of the $30 million a year his start-up earns on a proofread of his essay. Maybe he is better off without the Ivy League, maybe they are better off without him. But at least now he has a victim story to write when he reapplies next year.

The whole argument that the affluent have an advantage, so let's throw it out, is silly. Sorry, the affluent do have an advantage that will help them long after the tutoring they got for the ACTs, their 4.0 GPA, and help writing the college essay. They will have better support through college, no matter where they go, and they will have better support after college in being successful. Their ability and the ability of those around them that help them leverage their affluent advantage will matter. If I am investing in a student to go through college, well then I want the biggest bang for my buck. So whether I am a for-profit college, a parent, or a state, I want the best product from the college education. If the college essay helps reflect those best positioned to "market" themselves for success, well then like it or not it is a great metric. Instead of trying to throw out every measuring stick you do not like, or cut everyone off at the knees so we are all the same height, try finding a way to help the less affluent write a better essay, get supported through college, be better.

If the essay is a bad measuring stick for success, then sure throw it out; or refine the judgment of it. But I am not moved. If the college wants to use it to get around diversity restrictions, that is fine with me; it is up to the colleges. If the colleges admit a bunch of sob stories that flunk out or do not perform, the colleges that pick up the good students will start to rightly thrive. I was always told it worked two ways:

#1 You have two students, or two hundred with identical #1 in their graduating class, 4.0 GPAs, 36 on the ACT...the objective measurements are limited and say we have 200 equal students. The worry is there is something not measured. The hope is that an essay or a letter of recommendation could tease out that difference.

#2 You have a student with a 3.5 GPA, #2 in the graduating class, 32 on the ACT...but a great story of how there are only 10 people in their high school graduating class, they overcame cancer, and lost both parents last year...so the objective measurements may not accurately reflect what they are capable of.

Comment Intellectual Property wants to be free! (Score 2) 14

Exactly - no IP laws here would mean you can kiss things like the movie/content industry (and all its jobs) goodbye; there won't be any incentive to produce anything if you can't secure rights to it.

I am optimistic that the destruction of IP laws would be a good thing.

The first issue is who IP is meant to benefit: the creator or the public? I will assert that the answer is the public. IP is an artificial protection meant to encourage creators in an effort to benefit the public; so it is meant to be mutually beneficial, but ultimately for the public.

The biggest industries have the weakest IP protections: food, clothing, housing. Now it might be slightly disingenuous to draw a parallel between food and, say, movies; everyone needs to eat, no one needs to watch movies. However, no one needs to eat at a Michelin Star restaurant, or wear designer clothing; and yet fine dining and designer clothing make some people very wealthy. My assertion being that weak IP protection has not destroyed excessive profit making capabilities in those basic industries.

However, I foresee an alternative model for something like movies or drugs, if copyright and patents are abolished. The Kickstarter model would be my hope. One where an artist pitches an idea for a movie, and backers, big and small, would contribute enough to sponsor the creation of a script, then perhaps a "first episode or scene" and so on. Known writers, directors, and actors who ask for sponsorship would hopefully be able to demand more capital up front. The idea being that everyone is essentially paid for their work beforehand, or at least the funds are in escrow. Drug production would be funded by those who believe in a lab, company, or group of people who might be able to produce the desired drug, with the resulting product patent-free for all and the researchers already paid.

As a result, trademarks, or at least some way of verifying a creator's identity, would still be very useful for the public. I want to know the clothing I am buying is, say, "Levi's", and not a Chinese knockoff that will fall apart quicker. I want to make sure I am sponsoring the writer that wrote my favorite book or movie script for the next work I want to see created.

Comment Re: Maybe these people... (Score 1) 232

I agree that "net worth" is not an ideal value of a person.
However, I am not sure we have a better way to value people at the moment in the global market.
Fortunately, it seems we are mostly in a post scarcity world, where there should be enough shelter, air, water, food, and productive social things to do for everyone in the world.
Unfortunately, location, due to various factors, might limit some people in the world from enjoying all those benefits, but I would assert, without providing much evidence, sorry, that we are better off now than in most of human history in providing for people's needs.
To further stir the pot I will also claim this is due to capitalism, or essentially valuing people by their earning power and monetary wealth.
But I do confess, it seems we can do better, and the "system" is not without corruption; but still better than other systems attempted.
All of this is a wonderful debate probably too much for this forum, but I would love to hear proposals for other ways of assessing the value of a person; if that is appropriate at all.

All that aside I was more motivated by the Taylor Swift vs Elon vs 10,000 builder assertion in your comment.

I am not much of a Taylor Swift fan, nor have I ever really gotten too excited about a performing artist of any form to the level that "Swifties" appear to adore Ms. Swift. However, I do confess I have sung along to a few of her songs with the radio and have enjoyed them. The "Swifties" obviously value Taylor, and their enjoyment perhaps enhances their lives and makes them more productive, all kinds of positive things. But I agree that if she dies tomorrow not many will be worse off: a few days of mourning, and her songs will still play just as well on Spotify, and a new singer or ten will fill the void. Nothing of value will really have been lost.

But I am not convinced the same is true of Elon. I understand hating Elon is popular these days, but I am not convinced of that either. However, it does seem that he is more than a one trick pony, and I believe it can be argued that had Elon died twenty years ago the world would be a different place. I cannot say the same for Taylor Swift. I have had the pleasure of driving a Tesla a few times, and love it; I have not spent the money on one yet, but would love to own one. I am not sure on the numbers, but I am not sure I could point to another car company that sprang to life only in the 21st century and has had such success. Ford, GM, Dodge, Honda, Toyota, BMW, Volkswagen, Mercedes... are all pushing 80 to 120 years old, kings, and Elon's Tesla disrupted the market in what I would say is a positive way. I have not found another EV I would want from one of those old school manufacturers yet.

Beyond Tesla, I want to address the assertion that Elon did not develop the SpaceX rocket. I recall seeing a video interviewing a rocket scientist or something, talking about Elon getting the books and actually learning and proposing designs for the rockets. Now without the 10,000 engineers I am sure he would not have gotten into orbit. Not to mention SpaceX appears to be the leader vs Blue Origin, Virgin Galactic and other commercial ventures.

I am sure there were other internet satellite systems out there before Starlink, but I had never heard much of them before.

I am an LLM fan, and I understand that Elon deserves a ton of credit for helping form OpenAI. My understanding is that Google made the breakthrough with transformers in 2017, but OpenAI, a company that had only been founded two years earlier in 2015, beat Google and everyone else to the punch.

Electric vehicles, commercial space travel, satellite internet, and LLMs would probably all be a thing to some extent had Elon never existed. But I am not sure that they would be as good or as accessible today in a world without him. I am not convinced that Elon is simply a rich guy making money off the backs of a bunch of people actually doing work.

Now I am not convinced Elon is quite right on Bitcoin after hearing him talk on that matter. Plus it really seems questionable to have bought Twitter. Plus Sam Altman ran off with OpenAI and upset Elon. But you had to cite "10 thousand builders" dying to make an overwhelming disruption compared to Elon, not one builder, not ten, a hundred, or even a thousand, but ten thousand people's death is what it might take to disrupt the world vs one man, or maybe that comparison was aimed at the value of Taylor Swift...I am not sure ten thousand is high enough actually, we may find that ten thousand electricians, plumbers, carpenters leave the work force in a year due to retirement and death, and the world keeps progressing forward with another ten thousand or twenty thousand to replace them without missing a beat.

That is a lot of rambling, but I feel better now; thanks for reading this far.

Comment Re:So let me get this straight... (Score 1) 103

You don't need permission from the copyright holder to stream from youtube.

Copyright is all about permission. The holder has at best given Youtube permission to stream the copyrighted data.

And youtube itself is giving implicate permission to stream from them by serving up the stream. "If you continue using this service" TOS shrink wraps are not valid.

I am skeptical that Youtube serving up a stream is at all implicit permission in terms of copyright.

The Terms of Service are at minimum in place so that Youtube can point to them if the copyright holder comes complaining about something like Musi and Youtube can claim they have done their due diligence.

They signed over this control when they allowed it to be streamed on youtube. If the copyright holders have a grievance, it's with youtube.

They signed over the right for Youtube to stream with certain monetary kickbacks, I am sure. I agree that the copyright holder would probably target Youtube for weakly protecting their stream, or rather claim that Musi streams should count toward the monetary kickback they get in spite of the application accessing the stream.

Of course this is Apple cutting off Musi from the App Store, not Google or the copyright holders taking any actions. I am sure Google may have something to do with this, perhaps pressure on Apple. Because in the end Musi cuts into Youtube profits two ways: one, by potentially depriving them of the ability to display an ad and collect revenue on the stream, and two, the stream costs money to provide in servers and bandwidth with no income for their service from Musi or Musi users. Apple runs a tight clean ship; sometimes that is a perk of being in their ecosystem, sometimes it sucks. I am sure an Android APK of Musi is out there for free to download and use. If Musi really cared about their users they could release the Swift source code and users could compile it for their iPhone; I do not think Apple blocks that...

I am generally opposed to copyright, and I could not care less about the copyright holders being potentially deprived of revenue from Musi users choosing to use Musi and bypass Youtube advertisement or subscription income. The copyright holders should not have released their content until receiving compensation; once released, I do not support their claim for any control over the content. However, I do have a qualm with bypassing Youtube's revenue for serving up the stream; that is just not sustainable. If enough people did that then it for sure goes away, as Google cannot afford to serve streams up for free forever. I would support Musi ripping the stream and serving the content up, as then Musi would have to pay for the bandwidth and servers and find income to do so. But that is on Youtube to secure their streams.

Comment Re:So let me get this straight... (Score 1) 103

I would guess that Youtube has an agreement with the copyright holder and is not violating the copyright by streaming the data. But Musi does not have permission and is bypassing the Youtube interface, violating the copyright holder's claim to be able to control the streaming of the data?

From a practical standpoint, Youtube sells ad space in its app that pays for the servers and compensates the copyright holder?

Comment Re:It's not piracy that supports organized crime (Score 1) 149

I think you bring up the prime issues of copyright violation on the output.
I have heard the argument that potentially the person prompting the AI is liable for the copyright violation? The AI is treated as a tool, like a word processor, you typed in something and got out copyrighted material, that is your fault?

I like the idea of treating the AI services such as ChatGPT legally as a black box. Inside the black box is a person that may or may not have a computer running some software, either way the person in the black box is liable for any copyright violations coming out of the black box.
If the person in the black box brings in a bunch of copyrighted material, whatever he does with the material in the black box is his business, whatever that person distributes out of the black box they are liable for the copyright violations.

If the person in the black box releases a tool, or his "notes" on the copyright material, the model in the case of the Large Language Model (LLM) AI, then the model would have to be evaluated for copyright infringement.
I believe the structure of the LLM positions it nicely for arguing that the output is statistical facts, not simply a derivative work, and at worst tiny excerpts of copyrighted text well within fair use.

I would point to Google Scholar as perhaps the best example of this black box with the worst potential offender in terms of copyright.
Google Scholar has definitely been controversial, but still persists.
I would assume that in the black box Google has pushed to the limits on reproduction of copyrighted material; I am sure a dump of their database would easily contain the copyrighted material in its entirety.
However, Google is careful to present the search results in a fair-use minimal form from what I have seen, skirting or complying with copyright law just enough.

I agree, copyright holders smell money and will try and pursue it. I hope that the law will see AI as more than derivative work, not a reproduction, and the product worthy of protection from such cash grabs in the same way that search engines have been protected.
But I would also toss out copyright and force creators to simply collect the money up front through a system like Kickstarter, where they pitch their idea, hopefully get enough pledge money to justify producing the idea, then release it into the world already paid for and free to be used by all.

Comment Re:It's not piracy that supports organized crime (Score 1) 149

I am glad posts like this are now so weak they hide behind anonymous coward.

I have not been able to get a straight answer as to why anyone thinks AI training is copyright violation.
AI training relies on the same legal position as search engines rely on concerning copyright: Fair Use Doctrine, Public Access and Consent, Transformative Use.

However, drawing a parallel between the youth "learning" from material distributed by someone violating copyright and AI training would imply that the material the AI was trained on was sourced from an entity that did not have the right to distribute the copyrighted material. I would say neither the youth nor the company training the AI is responsible for the copyright violation, since they are not doing the copying but instead at worst receiving a copy that was distributed without the permission of the copyright holder; in that case the distributor of the copyrighted material is the one responsible. In this case the "pirate" site. Of course using torrents and sharing data at the same time as downloading it may be more than a gray area.

But back to the parallel between the AI training and the youth training. If the youth answered a question on the internet using what they learned from some copyrighted material, I would say that falls under fair use. I have not seen anyone argue that someone making a statement about a movie or a book on the internet is violating copyright, unless they were pasting large portions of the copyrighted material in the forum without permission. In the same way, AI appears to be no different, following a very established practice of learning and sharing.

Comment Re:The dirty secret of LLMs is the training data (Score 1) 38

I appreciate your response.
It is interesting to assert that archive.org is a strawman, and I am more convinced than ever it is not; at minimum it helps narrow my understanding of your initial post.
Perhaps it is because I mostly use ChatGPT in place of Google now, it is simply a search engine to me with a better interface.
In addition, the Internet Archive is the most extreme example, it serves up in whole entire web sites; Google no longer does that (miss the cached results), only in part, and ChatGPT and the like do not seem to do that unless you ask it to.
So if anyone was violating copyright, it is Internet Archive, and I am glad there is an exception for that, or that it is protected under certain doctrines.

The distinctions you make are that search engines are doing "simple" search and not paywalled.
So LLMs are doing more than simple search, or maybe complicated search, and are paywalled.
If the LLMs remove the paywall, open models, then it is the "simple" search aspect that we are left to address.

I was not aware that there were any "search" exceptions in any legal framework, and I cannot find any specific ones.
It seems that search engines rely on the following three legal positions for copyright: Fair Use Doctrine, Public Access and Consent, Transformative Use.
I will assert that LLMs can lean on the same legal position, since those positions do not declare some special specific "search" definition.

I am not convinced that there is a legal distinction for simple search in fair use; beyond that I would say that LLMs are even more protected due to their transformative use of the data in question.
In addition, I would assert that the Internet Archive, Google, Microsoft, Duck Duck Go, and the like all profit from storing copyrighted material in whole, and delivering it in whole without permission.
While they may not have paywalls in place, simply receiving compensation for delivering the copy they did not create seems it would already invalidate fair use; but thankfully it does not.

You seem to feel strongly about the position that what the creators of large language models do is illegal.
If a law was created that simply said compiling and using data sets that use copyrighted material in part and whole to train large language models is legal or there is at least an exception in the law, would you change your stance?
Or do you have a deeper moral or ethical opposition?

Your signature is presented even when the topic is not LLM related, and is a bit negative on LLMs.
Why the hate? Not that the signature is wrong, I think it is accurate; along with the last part of your initial post: the training data is mostly crap I fear.

But I see a lot of people jumping out with your argument that the creators of LLMs are stealing and violating copyright.
Then I see a lot of people confused by this like MpVpRb who either do not respect copyright at all anyway or do not see the way LLMs use the data as being an issue.
I figure your camp are creators or some threatened party who believe the way they make a living is going to be threatened by LLMs, so they must scream loudly to try and stop it.
I confess I fall in the latter camp with MpVpRb, and I am looking to understand the other perspective; especially the strong emotion behind it.
I am optimistic that you can provide that answer.

Comment Re:The dirty secret of LLMs is the training data (Score 1) 38

Before I ramble on, I am curious, gweihir: do you condemn archive.org, the Internet Archive Wayback Machine, in the same way?
It seems they more than anyone "take it, sell it, do whatever [they] want with it"; and I think they are great and do not want them to stop!

I have never been convinced that large language models violate copyright.
I would say that "public" is at best only restricted to large exact copies, and maybe only then an attribution is needed to be kosher.

I learned to write essentially from the example of other writing (perhaps not very well, but that is another matter).
It seems to me that large language models do the same, they "learn" sentence structure and words by reading other people's writing; and as a result they can create "unique" sentences, or string words together in a way we can comprehend that may not have been done exactly in that same way before.

My understanding is the large language models are weights for words and vectors for connecting words or strings of letters.
I do not believe that any of the training data exists in whole or really in any significant part in the models.
I do believe you can write something to use the model to obtain parts of the training data perhaps, and that would be the closest to condemning the model for copyright violation.
I think that all goes along with your signature.

I see the models as serving a primary purpose of interpreting my questions and forming a coherent response.
But I would appreciate attribution for ideas, concepts, and quotes to the original source; not necessarily because I care to credit or pay that source, but more so that I can selfishly use that to vet the response's validity.

It seems that people are confusing large language models with search engines; as I imagine a search engine like Google has to have perfect copies of everything they provide a search for. This seems to be apparent simply by running a search and Google provides an excerpt of a web page that matches my query.

Now people do whine about search engines "stealing" data as well; like the dumb news sites that want to cut search engines and social networks off and do not realize those are really free marketing that the news site has not been paying for.

I am still waiting for the open search engine, where a nice "rolling" or incrementally updated torrent serves a database of the entire internet, and perhaps beyond with books and the like that I can search on locally based on my own algorithm choices. But I am sure that torrent and the database file would come under fire for copyright violations as well...so it can just be hosted on Pirate Bay I guess.

Comment Re:Levels of Open (Score 2) 38

Well, I finally took the time to skim the summary....

It really feels like the word "Open" is lost, at least for AI; given OpenAI is not fully Open from what I understand.
If the Open Source Initiative (OSI), "a long-running institution aiming to define and “steward” all things open source", does not "properly" define Open AI as having an open data set, then perhaps it is time to move beyond "Open" and cut the legs out from underneath a seemingly corrupted organization.
I like the word Libre, how about Libre AI?
Then we start dragging the word open through the mud until anyone using it looks like a lying con artist.

Comment Levels of Open (Score 2) 38

I of course asked ChatGPT what key components make up the creation and use of a model:
Training Data, Preprocessing and Data Pipeline, Training Configuration, Training Script, Model Checkpoints, Base Model (if applicable), Fine-tuning / Specialized Training, Trained Model, Inference Code, Deployment Pipeline, Evaluation and Testing Metrics, Post-processing.

I would say Training Data has been the most controversial aspect of AI creation, followed by censoring that may take place in a handful of the steps from the Training Data to the Post-processing.

I understand that Training is the most "expensive" part of the process, and at best most of us can really only make use of the end model.

It would be awesome to have access to all the training data, adjust it how I would want to, tune the configuration for the training, and generate my own model; and I would say to be labeled a full open AI then all these things should be available. However, I would be happy to at minimum have the model and control of the inference code settings and post processing and still say it is sort of Open AI.

I am not a believer in copyright or censorship; but I think I understand why they exist and how we can live without them. So I am not afraid of the negatives of open data sets or the results of unrestrained AI; I think it will be great, and mildly painful as we adjust to the new better reality!

Comment Re: I'm going the (Score 1) 177

Sorry I woke up on the wrong side of the bed and aspects of your comment pissed me off way more than they should have.
I am sure you are a nice person Bobby, so forgive my rant; you are not all wrong, making some good points, but you have some negative view points that I can understand but want to stomp out of myself and others.

It irritates me, maybe even angers me, that a company like Microsoft gets away with selling such horribly broken products. Even with years and years of sometimes hundreds of patches, it's still broken.

I have no serious issues with Windows 10, it is not horribly broken, nor has it ever been.

There may be "features" I wish Microsoft did not insert, but I choose my Windows 10 machine for most tasks over Mac OS and Linux machines because it works better for me; I hate that fact, I want to go all Linux and kiss proprietary closed big brother behind, but I have not managed because it is just the best. In addition I have not managed to convert my wife, parents, or much anyone away from Windows to Mac OS or Linux. I do not believe it is just because it is the devil they are familiar with, although that carries weight, but because Microsoft has done a great job.
I would love to toss away everything Microsoft for some open source Linux desktop, but I have not managed that...yet

Microsoft supporting their product for years and years with hundreds of patches is not a negative at all; it is a testament to how great an investment the Windows 10 license has been. I cannot name any other company outside OS vendors that comes and upgrades something I bought 5 years ago from them for free.
I do not see it as a warranty covered defect either, the Windows 10 install I did 5 years ago ran great, and without updates would still run great, just like your fairly patched Win7. But there have been some performance improvements, notably the Windows File Explorer patch that I did not know I needed.
I would love for Lexus to come back 5 years after I bought my RX and say oh yeah here is an engine boost. Sure they made a "mistake" and that is why my engine/explorer was not running as fast as it should have, but I did not notice.

I would be tempted to say: Windows 10 has this gaping security hole? That was a mistake they made in 2015, and in 2024 some Russian found it and infected machines; then Microsoft released a patch. That is equivalent to Microsoft having sold me a safe, Russians figuring out how to crack it 9 years later, and Microsoft releasing a fix to counter that crack. Sure, it would be great if Microsoft had foreseen the issue, or had not left that buffer overflow or security flaw there.

Is Microsoft innocent? Nah, supporting their product keeps it #1 in the marketplace; if they neglected it, or charged too much for patches, then Mac OS or Linux would displace them. And I am sure some of the "patches" do things I would rather they did not add.

Which brings me to my second frustration: there should be some mechanism in place- markets, laws, something, to force Microsoft to completely finish a product before being allowed to introduce a new one.

This is the part that really pissed me off, the assertion that we need some law; maybe I am too libertarian.
This goes right along with the "right to repair" bullshit.
Just like everyone else I do not want to waste money on an "unfinished" product, and I want to be able to upgrade my battery or have an "unauthorized" repair. Sorry, I am ranting too much on the "right to repair" stuff, which you did not bring up...but I just hate the idea of forcing a company to do "right" with laws.
But you did mention market first to your credit! I do not think "laws" would get us anything but a minimal result that would probably hurt the market and most likely just be an illusion of helping, while hurting competition by raising the bar for entry.
No forcing Microsoft to do better; entice them to do better by not buying Windows 11, starving them of your money and the money of your friends and relatives.

Every new version of Windows brings many many unwanted "features", and hundreds of security holes and errors. How about refining and finishing a product?

Ten years of refinement on Windows 10 seems like an eternity for a software product, and that is after building on another 20 plus years before with previous versions and maintaining backward compatibility. I would argue that there is no more refined or finished OS out there, Mac OS and Linux fall short.

I like to think Win7 is fairly well patched. I'm in no danger. I use some good protection software; my systems are NEVER open to the 'net- always firewalled, and I make sure there are no open ports anyway. There are several malwares that can hide from a scanner anyway. I regularly pull the hard drive (SSD in this machine) and scan it as a slave drive in a known good machine so that no malware can infect and hide during boot. Happily, no malware. BTW, Windows Defender does still update.

No malware that you know of...this type of firewalled setup is just as prudent, if not more so, for a more recent version of Windows or any other OS that has more market share and will have more zero day "external" vulnerabilities to come. I fear this sentiment leans too heavily on the old trope of setting up an unpatched Windows 2000 machine, connecting it to the internet, and timing how long it takes for it to get pwned, as if all we need is a firewall and some virus scanner to protect it. The vulnerability is not from some Windows network stack hole anymore; it is from inside, most likely you, a zero day for your browser, or some software you let in to your system. Your firewall does not protect you from those inside threats, and your Windows Defender updates are probably still reactive. Eventually you will also have the advantage that no one is targeting your old OS with all its known vulnerabilities, because the market share is too low.

Sorry about my rant, thanks, going to find my coffee now...
