Request for Large Databases (100,000+ rows) to Practice with

  • Thread starter Thread starter WWGD
  • Start date Start date
  • Tags Tags
    Request
AI Thread Summary
The discussion centers around seeking larger-scale databases for SQL Server practice, moving beyond small databases of fewer than 200 rows. Participants highlight that database size is relative, with 100 million rows considered large in some contexts. Recommendations include exploring the MySQL website for a test database with 4 million rows and considering CSV data from various sources for incremental uploads. The conversation also touches on the capabilities of different relational database management systems (RDBMS), noting that Oracle is highly regarded for SQL careers, while Microsoft SQL is seen as less robust. Free versions of Oracle Database 11g Express, Microsoft SQL Server Express, and IBM's Db2 Express-C are suggested for training. The importance of understanding indexing and hashing for managing large datasets is emphasized, with one participant mentioning personal experience with databases containing over a billion entries.
WWGD
Science Advisor
Homework Helper
Messages
7,701
Reaction score
12,750
Hi All,
I am trying to up my game in SQL Server in general. Specifically, my knowledge has been obtained so far by working with very small-scale databases ( fewer than 200 rows). Does anyone know of free larger-scale databases that are available?
Thanks.
 
Technology news on Phys.org
Hmm. Before I retired, this was considered a small database. 100 million and up was getting large, but not very.
 
I don't know about 100,000 rows but there's a test database on the MySQL website here that is described as 'large'. See under the 'Example databases' heading.
 
  • Like
Likes WWGD
PAllen said:
Hmm. Before I retired, this was considered a small database. 100 million and up was getting large, but not very.
I guess the terms small and large depend on the machine available. With 100,000 I thought of an Oracle database to handle best, but I wouldn't like to handle 100,000,000 other than on a DB2. However, I think that even 100,000 are a big deal for MySQL. How about some csv data taken from anywhere or a continuous feed (rss I think) from a weather page and upload them piece by piece? The entire amount at once might be a disappointing experience.
 
  • Like
Likes WWGD
andrewkirk said:
I don't know about 100,000 rows but there's a test database on the MySQL website here that is described as 'large'. See under the 'Example databases' heading.
If you look down into the doc for that dB described as large, it is, in fact, 4 million rows spread across 6 tables.
 
Last edited:
  • Like
Likes WWGD
fresh_42 said:
I guess the terms small and large depend on the machine available. With 100,000 I thought of an Oracle database to handle best, but I wouldn't like to handle 100,000,000 other than on a DB2. However, I think that even 100,000 are a big deal for MySQL. How about some csv data taken from anywhere or a continuous feed (rss I think) from a weather page and upload them piece by piece? The entire amount at once might be a disappointing experience.
I see, so I think then I may not even need a scraper, but even Excel's import/export Wizard would do, it seems.
 
Oracle is the database to know if you are set on making SQL your career path. Microsoft SQL is "Oracle-Lite" at best. IBM's DB/2 is also a decent RDBMS, just not as popular as Oracle or MS-SQL. All three RDBMS' have pretty much unlimited number of records (MS-SQL limits you to 16 terrabytes). Oracle allows up to 1,000 columns per record. MS-SQL limits you to 8,060 bytes per record. Oracle allows for recursive scripts, MS-SQL does not. Microsoft also has their own version of SQL, they no longer use the ANSI Standard for SQL (particularly with regard to their joins), but Oracle does.

You can get the Oracle Database 11g Express Edition for free for training and development purposes. Microsoft offers the same thing for MS-SQL Server Express. If you are really interested, you can also get a free copy of IBM's Db2 Express-C.

Also, check out Oracle's, Microsoft's, and IBM's developer websites. They will often include free tools for developers.
 
  • Like
Likes WWGD and fresh_42
You could also just install your own Linux VM, apt-get or yum MySQL-server and insert whatever you want. Delete the vm was hen you are done.

Btw, 100,000 rows is a medium size database at best. I have over a billion entries in some of mine. The most important thing to understand is indexing and hashing when dealing with things that big. Different types of indexes are optimized for different types of data and queries.
 
  • Like
Likes WWGD
Back
Top