Let's just run with the "AI training" misconception.
There's a document. It was created by an author. The author has the exclusive right to copy the document in its entirety onto his own website (copy=1, violation=0). Your browser knocks on the door and asks for the document. The author's website copies the document over the network into your browser's process memory (copy=2, violation=0). That's fine, because the author's HTTP server initiated the transfer and the author intended to authorize the copy.
After that, things get murky: several more copies are made, but to what purpose?
The web browser copies it into its cache on disk (used, e.g., when you refresh the page, to avoid downloading it again over the Internet). Is this a legal copy? It is standard browser behaviour. Other similar copies might be made, e.g. by Squid (a caching and forwarding HTTP web proxy). I will ignore these copies, as no one seems to be upset about them. (Actually, that is not entirely true.)
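For what it is worth, the site owner already has a standard way to limit these mechanical cache copies: HTTP caching headers. The header names below are real HTTP/1.1 caching directives; the particular values are just illustrative.

    Cache-Control: no-store
        (asks browsers and intermediate proxies such as Squid not to keep any copy at all)
    Cache-Control: private, max-age=300
        (allows only the end user's own browser to keep a copy, and only for five minutes)

So the publisher can already say something about who may hold these incidental copies - just not about what they may be used for.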
You read the document, and another copy is made, residing in your brain's memory. Is this legal? It is not mentioned in copyright legislation, but it is implied as part of the intended use of the document, so that you may learn whatever the document talks about. If a friend is sitting next to you, I can see no legislation that prevents him or her from also reading it - thus several copies might be made.
What if, instead, the web browser presents it not to a human but to an artificial human (an AI)? Then a copy will be made in the AI's memory. It is this copy that is being objected to. What is the difference between a copy held in grey matter and one held in silicon?
Or is the disagreement not about how the copy is held but about the use to which it will be put? An AI will, presumably, be used for commercial gain; is that the problem? But if I read a book about Python and then get a job as a Python programmer, is that not also commercial gain?
Or is it that the AI might further disseminate the knowledge? We might be getting somewhere here: Warner/Chappell (not Disney, as is often misremembered) spent decades demanding royalties from anyone who performed their copyrighted "Happy Birthday" song in public, a claim the courts only threw out in 2015.
Whatever. The point is not the copies but the purpose of the copies. This is what needs to be discussed.
I think that we need an enhancement to robots.txt whereby the web site (copyright owner) can state for what purposes copies may be made. All mechanical readers (i.e. all but humans) would be obliged to obey it. This would add little overhead; indeed, well-behaved search-engine spiders already honour robots.txt. What uses? Web indexing; AI learning; quoting of small sections; quoting of the entire document; ... The list of different uses needs discussion. I have zero faith that the AI cowboys would take any heed - they are entitled and seem to think that the world owes them a living.
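To make the idea concrete, here is a rough sketch of what such a purpose-aware robots.txt might look like. The Allow-purpose / Disallow-purpose directives are entirely hypothetical, invented here for illustration - today's robots.txt essentially only knows Allow and Disallow by path, grouped per User-agent.

    User-agent: *
    Allow-purpose: web-indexing
    Allow-purpose: quote-excerpt
    Disallow-purpose: ai-training
    Disallow-purpose: quote-full-document

The nearest thing that exists today is blocking individual AI crawlers by name, e.g. a "User-agent: GPTBot" group with "Disallow: /", which says who may fetch copies but nothing about what the copies may be used for.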