
Comment Re:Large datasets are mostly IO limited (Score 4, Informative) 135

Hi - MapD creator here. Agreed, GPUs aren't going to be of much use if you have petabytes of data and are I/O bound. But what I think unfortunately gets missed in the rush to indiscriminately throw everything into the "big data bucket" is that a lot of people do have medium-sized (say 5GB-500GB) datasets that they would like to query, visualize and analyze in an iterative, real-time fashion, something that existing solutions won't allow you to do (even big clusters often incur enough latency to make real-time analysis difficult).

And then you have super-linear algorithms like graph processing, spatial joins, neural nets, clustering, and rendering blurred heatmaps, which do really well on the GPU: there the 70X speedup you get on memory-bound work turns into 400-500X. Particularly since databases are expected to do more and more viz and machine learning, I don't think these are edge cases.

Finally, although GPU memory will always be more expensive (but faster) than CPU memory, MapD can already run on a 16-card server with 128GB of GPU RAM, and I'm working on a multi-node distributed implementation where you could string many of these together. So having a terabyte of GPU RAM is not out of the question, and given the column-store architecture of the db, that memory can be used more efficiently by caching only the necessary columns. Of course it will cost more, but for some applications the performance benefits may be worth it.
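
To put a rough number on the "cache only the necessary columns" point, here is a back-of-the-envelope sketch. The schema, column widths and row count are made up for illustration and are not MapD's actual layout.

```python
# Hypothetical schema: column name -> bytes per value.
# These widths are illustrative assumptions, not MapD's real storage format.
COLUMN_WIDTHS = {
    "tweet_id": 8,
    "user_id": 8,
    "time": 4,
    "lon": 4,
    "lat": 4,
    "text": 140,
}


def cache_bytes(num_rows, columns):
    """Bytes needed to hold only the named columns for num_rows rows."""
    return num_rows * sum(COLUMN_WIDTHS[c] for c in columns)


rows = 1_000_000_000  # assumed table size

full = cache_bytes(rows, COLUMN_WIDTHS)             # every column
needed = cache_bytes(rows, ["time", "lon", "lat"])  # columns a map query might touch

print(f"all columns:  {full / 1e9:.0f} GB")
print(f"three columns: {needed / 1e9:.0f} GB")
```

Under these assumed widths the whole table would not fit in 128GB, while the three columns the query reads fit with plenty of room to spare.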

I just think people need to realize that different problems need different solutions, and just b/c a system is not built to handle a petabyte of data doesn't mean it's not worthwhile.

Comment Re:That Didn't Take Long: Database Down For Maint. (Score 5, Informative) 135

Har har... Well things got tricky when I wrote the code to support streaming inserts (not implemented in the current map) so you could view tweets or whatever else as they came in - this required a lot of fine-grained locking. May just bandaid this and give locks to connections as they come in until I can figure out what's going on. Todd

Comment Re:sounds like... (Score 5, Informative) 135

So I use Postgres all the time, but MapD isn't built on Postgres; it actually stores its own data on disk in column form in (I admit crude) memory-mapped files. I have written a connector between MapD and Postgres, though, since I use Postgres to store the tweets I harvest for long-term archiving. The connector uses pqxx (the C++ Postgres library). Todd
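
For anyone unfamiliar with the idea of a column stored in a memory-mapped file, a minimal sketch follows. It uses only the Python standard library; the file name, the int32 encoding and the helper names are assumptions for illustration and say nothing about MapD's actual on-disk format.

```python
import mmap
import os
import struct
import tempfile


def write_column(path, values):
    """Write int32 values back to back, one column per file."""
    with open(path, "wb") as f:
        f.write(struct.pack(f"<{len(values)}i", *values))


def read_column_slice(path, start, stop):
    """Map the file and decode rows [start, stop) without reading the rest."""
    with open(path, "rb") as f:
        with mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as m:
            raw = m[start * 4 : stop * 4]  # 4 bytes per int32 value
    return list(struct.unpack(f"<{stop - start}i", raw))


# Hypothetical column file holding a retweet count per row.
col_path = os.path.join(tempfile.mkdtemp(), "retweets.col")
write_column(col_path, [3, 0, 12, 7, 1])

middle = read_column_slice(col_path, 1, 4)
print(middle)
```

Because each column lives in its own file with fixed-width values, row N sits at byte offset N * 4, and the OS pages in only the parts of the file that are actually touched.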

Comment Re:PostgreSQL used GPU 2 years ago (Score 5, Informative) 135

The 70X is actually highly conservative - and this was benched against an optimized parallelized main-memory (i.e. not off of disk) CPU version, not say MySQL. On things like rendering heatmaps, graph query operations, or clustering you can get 300-500X speedups. The database caches what it can in GPU memory (could be 128GB on one node if you have 16 GPUs) and only sends back a bitmap of the results to be joined with data sitting in CPU memory. But yeah, if the data's not cached, then it won't be this fast. That's true, a lot of work has been done on GPU database processing - this is a bit different I think b/c it runs on multiple GPUs and b/c it tries to cache what it can on the GPU. Todd (MapD creator)
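
The "send back a bitmap and join it with data in CPU memory" step can be sketched in plain Python as below. This is only an illustration of the idea: the filter here runs on the CPU as a stand-in for the GPU scan, and the column names, data and function names are hypothetical.

```python
def filter_to_bitmap(column, predicate):
    """Stand-in for the GPU-side scan: one bit per row, set where the predicate holds."""
    bitmap = bytearray((len(column) + 7) // 8)
    for row, value in enumerate(column):
        if predicate(value):
            bitmap[row // 8] |= 1 << (row % 8)
    return bitmap


def rows_from_bitmap(bitmap, host_column):
    """Host-side step: keep the values of host_column whose bit is set."""
    return [
        value
        for row, value in enumerate(host_column)
        if bitmap[row // 8] & (1 << (row % 8))
    ]


# Hypothetical data: 'lat' stands for a column cached on the GPU,
# 'text' for a column left in CPU memory.
lat = [42.3, 10.0, 51.5, 40.7, -33.9, 48.8, 35.6, 55.7, 37.8, 1.3]
text = ["a", "b", "c", "d", "e", "f", "g", "h", "i", "j"]

bitmap = filter_to_bitmap(lat, lambda v: v > 40.0)
selected = rows_from_bitmap(bitmap, text)
print(len(bitmap), selected)
```

The point of the bitmap is transfer size: ten rows need two bytes here, and a billion rows would need about 125MB, regardless of how wide the columns left on the host are.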

Comment Re:sounds like... (Score 5, Informative) 135

Hi, MapD creator here - and I have to disagree with you. The database ultimately stores everything on disk, but it caches what it can in GPU memory and performs all the computation there. So all the SQL operations are occurring on the GPU, after which, in the case of the tweetmap demo, the results are rendered to a texture before being sent out as a PNG. But it works equally well as a traditional database - it doesn't do the whole SQL standard yet but can handle aggregations, joins, etc. just like a normal database, just much faster. Todd
