How to speed up search queries in a DB?

Vadym

do u even auto,m8?
Jul 11, 2009
I got my progger to code me a custom DB over the past few months so I could throw in a load of my data, organize it, and start scraping more info on the records.

Now that everything is ready and the data is loaded up, there's a new problem. There are 100MM lines of data, so a single search query can take up to 10 minutes.

Running multiple searches is worse: even 20-30 queries takes HOURS.

I'm not a programmer - I've always been a concept person - but I'm hoping someone has a suggestion I can pass on to my progger to point him in the right direction.

Right now the alternative is going to be to internally split the data up, so a query isn't hitting the one big list but 100 little lists at a time.

Plz help?

Thanks.


 


Without knowing what's in the database or the type of database (SQL, MySQL, Mongo, Couch, etc), we have no way of helping you.
 
^ thanks, that's what I mean by lack of programming knowledge on my part.

MSSQL
 
Well, first thing you may want to do is get off MSSQL.

Anyway, Google for "MSSQL slow query log", and I'm sure there's a feature in MSSQL or a script you can install that will log any queries that take over X seconds. From there, it's just a matter of optimization -- ensure your tables are structured properly, proper indexes are in place, queries are optimized, etc.

Also, are many writes being done to the database? If so, you may want to get database replication going and split the writes off to a separate database. A write will lock the table, and with large amounts of data the lock can be held for quite a while. So any other queries have to sit there waiting for that lock to release. For example, if you have 60 writes sitting in queue, each of which takes 15 seconds to execute, then when you try to query the database, your query isn't even going to start looking at the data for 15 minutes.

Failing all that, look into partitioning your data somehow, and modify the software to query the correct partition(s).
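On the slow-query-log idea: MSSQL has its own tooling for this, but purely as an application-side sketch of "log anything over X seconds", here's a rough Python illustration using SQLite as a stand-in (the table and the threshold are made up for the example):

```python
import logging
import sqlite3
import time

logging.basicConfig(level=logging.WARNING)
log = logging.getLogger("slow-query")

SLOW_THRESHOLD_S = 0.5  # hypothetical cutoff; tune to your workload

def timed_query(conn, sql, params=()):
    """Run a query and log it if it takes longer than the threshold."""
    start = time.perf_counter()
    rows = conn.execute(sql, params).fetchall()
    elapsed = time.perf_counter() - start
    if elapsed > SLOW_THRESHOLD_S:
        log.warning("slow query (%.2fs): %s", elapsed, sql)
    return rows

# Tiny demo table so the sketch is runnable end to end.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE leads (id INTEGER PRIMARY KEY, email TEXT)")
conn.execute("INSERT INTO leads (email) VALUES ('a@example.com')")
rows = timed_query(conn, "SELECT * FROM leads WHERE email = ?", ("a@example.com",))
```

Once you have a log like this, the queries that show up in it repeatedly are the ones worth optimizing first.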
 
I have a feeling there is no index in place on the database. That right there should cut your query times by more than half. 10 min for 100MM lines? There's no index in place, has to be it. Solution: the SQL CREATE INDEX statement. Also make sure the table has a primary key - the SQL PRIMARY KEY constraint.

I'm making all these assumptions based off of experience working with MySQL. MSSQL - good lord...
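To show what an index actually changes, here's a sketch using Python's built-in SQLite as a stand-in for MSSQL (table and column names are invented). The query plan flips from a full scan to an index search once the indexed column is the one being filtered on:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE leads (id INTEGER PRIMARY KEY, email TEXT, state TEXT)")
conn.executemany("INSERT INTO leads (email, state) VALUES (?, ?)",
                 [(f"user{i}@example.com", "CA") for i in range(1000)])

# Without an index, filtering on email is a full table scan.
plan_before = conn.execute(
    "EXPLAIN QUERY PLAN SELECT * FROM leads WHERE email = 'user500@example.com'"
).fetchall()

# Index the column you actually search by.
conn.execute("CREATE INDEX idx_leads_email ON leads (email)")

plan_after = conn.execute(
    "EXPLAIN QUERY PLAN SELECT * FROM leads WHERE email = 'user500@example.com'"
).fetchall()

before_detail = plan_before[0][-1]  # e.g. "SCAN leads"
after_detail = plan_after[0][-1]    # e.g. "SEARCH leads USING ... INDEX idx_leads_email"
```

MSSQL's execution plans look different, but the scan-vs-seek distinction is the same thing to look for.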
 
^ What ccarter said. Make sure you have indexes on any columns you will be searching by. Also, make sure you have enough RAM / RAM assigned to the database.

MSSQL is fine, don't switch.
 
Make sure you have indexes on any columns you will be searching by.

Be careful with this as well, though. You definitely need indexes, but don't just add in, say, 8 indexes to the table and expect it to run faster. If anything, it will run slower.

Honestly, just grab a qualified DBA. Will cost you anywhere from say $500 - $3000, depending on what needs to be done.

Assuming proper indexes are in place, I would imagine locking is your next biggest problem, and replication to siphon the reads / writes to different databases is the solution to that. I've done it several times before, and it works beautifully. You have a bunch of affiliates hammering away at the F5 button 100 times/hour, but also a bunch of writes in queue, so the affiliates get a timeout message because the tables are always locked.

Split it up so writes are sent to one database and reads to another, and everything works great. Affiliate stats would be off by a few mins, but they didn't care. As long as they could hit that F5 button 100 times/hour, they seemed happy. Same as when you send a PayPal payment, sometimes it won't show up in your Recent History list for a few mins. Same shit, different pile.
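A rough sketch of the read/write split, heavily simplified: real replication is handled by the database server, and here the "replica" is just a second SQLite connection used for illustration. The only real logic is the routing decision:

```python
import sqlite3

class SplitRouter:
    """Send writes to the primary and reads to a replica (sketch only)."""

    def __init__(self, primary, replica):
        self.primary = primary
        self.replica = replica

    def execute(self, sql, params=()):
        # Crude routing rule: SELECTs go to the replica, everything else
        # (INSERT/UPDATE/DELETE) goes to the primary.
        is_read = sql.lstrip().upper().startswith("SELECT")
        conn = self.replica if is_read else self.primary
        cur = conn.execute(sql, params)
        if not is_read:
            conn.commit()
        return cur.fetchall()

primary = sqlite3.connect(":memory:")
replica = sqlite3.connect(":memory:")
for c in (primary, replica):
    c.execute("CREATE TABLE stats (clicks INTEGER)")

router = SplitRouter(primary, replica)
router.execute("INSERT INTO stats VALUES (42)")   # routed to primary
# Pretend replication has copied the row over (this is the "stats lag
# by a few mins" part from the post above):
replica.execute("INSERT INTO stats VALUES (42)")
replica.commit()

rows = router.execute("SELECT clicks FROM stats")  # routed to replica
```

The upside is exactly what's described above: reads never sit behind a queue of write locks.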
 
Databases aren't always that great at indexing text. What type of queries are you doing?

If you are doing queries that search through a text field with a leading wildcard, such as LIKE '%search text%'... then you are going to have a bad day.

If you need to do something like the above, then you should look at utilizing MSSQL's Full-Text Search index, or use a real search engine like Sphinx.

If you are just searching for an integer, small amount of text without a left wildcard, etc... you probably just need an index like others have suggested.
 
Most coders tend to suck, so here's some stuff to look for, since idk wtf your DB is like.

Here are basic mistakes I see all the time that make me lose faith in all programmers whose work isn't thoroughly tested.

1) SELECT * is shit; they should select specific columns.
2) Don't put database queries in a loop or perform a shitload of queries on one page.
3) LIKE '%$kw%' is a bad way to search, and it gets worse the more data there is to search through.
4) Use indexing, but use it on a table that only does reads. Indexes need to be updated after a write. Fucking this up makes you slower.
5) Look at your database. Are there columns that serve no purpose except to hold "Yes" or "True" or "No" or "False", or some other value where there's only a limited number of answers, stored as a string? OMFG, you and a billion other dudes. Integer comparisons are the fastest. For boolean yes/no, use an integer datatype: 0 or 1, or NULL / 0.
6) If you have other columns storing data that's limited to a set number of answers, maybe like "California" or "Author" or whatever: store that data in another table with a unique id, make sure it's indexed, and reference that id from your main table instead of duplicating "California" throughout your entire DB. Jesus christ. As a database gets bigger, I'd do this for first names, last names, and email addresses (name, domain, TLD) so you're only doing integer comparisons.
7) Look at the datatypes of the columns. Are you storing any integers as strings? Use the correct data types for timestamps, dates, integers, and strings.
8) Set your character lengths correctly. If a column should only have 140 characters, specify 140 characters. This may or may not help depending on the database + version.
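Points 5 and 6 sketched out, again using SQLite as a stand-in (all table and column names here are invented for the example): the repeated string lives once in a lookup table, the flag is a 0/1 integer, and the hot query compares only integers:

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Lookup table: each state name stored exactly once (point 6).
conn.execute("CREATE TABLE states (id INTEGER PRIMARY KEY, name TEXT UNIQUE)")

# Main table: integer reference + 0/1 flag instead of 'True'/'False' strings (point 5).
conn.execute("""CREATE TABLE leads (
    id INTEGER PRIMARY KEY,
    state_id INTEGER REFERENCES states(id),
    verified INTEGER  -- 0 or 1, never the string 'Yes'/'No'
)""")
conn.execute("CREATE INDEX idx_leads_state ON leads (state_id)")

conn.execute("INSERT INTO states (name) VALUES ('California')")
ca_id = conn.execute("SELECT id FROM states WHERE name = 'California'").fetchone()[0]
conn.executemany("INSERT INTO leads (state_id, verified) VALUES (?, ?)",
                 [(ca_id, 1), (ca_id, 0)])

# The hot query touches only integers; the string exists in one row of `states`.
verified_in_ca = conn.execute(
    "SELECT COUNT(*) FROM leads WHERE state_id = ? AND verified = 1", (ca_id,)
).fetchone()[0]
```

Join back to `states` only when you need the name for display, not while filtering 100MM rows.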

If your people can't do this, message me. My people constantly have 17 years [my experience] of web development jammed down their throats. In 2009 I was sending 60,000,000 emails a day, and we used some of the optimizations above so we could finish before lunchtime.
 
I should note, depending on what you're searching through: making a column that has all the punctuation removed, everything lowercased, and filler words like "a", "the", "consequently", "my", etc. removed, and then base36-encoding the data and the search parameters before searching (of course), would really speed you up. For bonus points, encode to base36 client-side in a hidden input field.
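A minimal sketch of that pre-cleaned search column, leaving the base36 step aside (the stop-word list here is just an example; extend it to taste):

```python
import re

# Hypothetical stop-word list; grow this for your own data.
STOP_WORDS = {"a", "an", "the", "my", "consequently"}

def normalize(text):
    """Lowercase, strip punctuation, and drop stop words, producing the
    kind of pre-cleaned value you'd store in a dedicated search column."""
    words = re.findall(r"[a-z0-9]+", text.lower())
    return " ".join(w for w in words if w not in STOP_WORDS)

norm = normalize("Consequently, my DOG ate the Homework!")
```

You'd run both the stored data and the incoming search terms through the same function, so matches stay consistent.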

Also, cached queries - lol, forgot that one. You have tons of options; it depends on your setup, how much server control you have, and how much data you'll be adding to this database. It all matters. Beyond cached queries, page caching and browser caching are things you'll want to do as well.
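A bare-bones sketch of query-result caching (real setups often put memcached/Redis in front of the DB instead, and MSSQL keeps its own internal caches; the TTL and names here are made up):

```python
import sqlite3
import time

class CachedDB:
    """Cache read-query results for a fixed TTL (illustration only)."""

    def __init__(self, conn, ttl_s=60.0):
        self.conn = conn
        self.ttl_s = ttl_s
        self._cache = {}

    def query(self, sql, params=()):
        key = (sql, params)
        hit = self._cache.get(key)
        if hit is not None and time.monotonic() - hit[0] < self.ttl_s:
            return hit[1]  # served from cache; no database work at all
        rows = self.conn.execute(sql, params).fetchall()
        self._cache[key] = (time.monotonic(), rows)
        return rows

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (n INTEGER)")
conn.execute("INSERT INTO t VALUES (7)")

db = CachedDB(conn)
first = db.query("SELECT n FROM t")
second = db.query("SELECT n FROM t")  # within the TTL: cache hit
```

Like the replicated-stats example earlier in the thread, the trade-off is freshness: cached results can lag the real data by up to the TTL, which is usually fine for dashboards and stats pages.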