Cool SQL Optimizations That Do Not Depend on the Cost Model. Part 1
4. Removing "meaningless" predicates
Equally meaningless are predicates that are (almost) always true. As you can imagine, if you run:
SELECT * FROM actor WHERE 1 = 1;
...then the database will not actually evaluate the predicate; it will simply ignore it.
I once answered a question about this on Stack Overflow, which is what prompted me to write this article. I'll leave testing this as an exercise for the reader, but what happens if the predicate is slightly less "meaningless"? For example:
SELECT * FROM film WHERE release_year = release_year;
Do you really need to compare the value with itself for each row? No, there is no value for which this predicate would be FALSE, right? But we still need to check it. Although the predicate can never evaluate to FALSE, it may well evaluate to NULL, again because of three-valued logic. The RELEASE_YEAR column is nullable, and if any row has RELEASE_YEAR IS NULL, then NULL = NULL yields NULL and the row must be excluded. So the query is transformed into:
SELECT * FROM film WHERE release_year IS NOT NULL;
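The equivalence of the two predicates can be checked on any engine that implements three-valued logic. Below is a minimal sketch using Python's built-in sqlite3 module. SQLite is not one of the databases tested in this article, and the four-row film table is a hypothetical stand-in for the real one. The sketch only shows that both predicates select the same rows; it says nothing about how an optimizer rewrites them.

```python
import sqlite3

# Hypothetical miniature of the film table; release_year is nullable.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE film (film_id INTEGER PRIMARY KEY, release_year INTEGER)")
conn.executemany(
    "INSERT INTO film (film_id, release_year) VALUES (?, ?)",
    [(1, 2006), (2, None), (3, 2006), (4, None)],
)


def film_ids(where_clause):
    """Return the sorted film_id values matching the given WHERE clause."""
    sql = f"SELECT film_id FROM film WHERE {where_clause} ORDER BY film_id"
    return [row[0] for row in conn.execute(sql)]


self_compare = film_ids("release_year = release_year")
not_null = film_ids("release_year IS NOT NULL")

# NULL = NULL evaluates to NULL (None in Python), which WHERE treats like FALSE.
null_eq_null = conn.execute("SELECT NULL = NULL").fetchone()[0]

print(self_compare)   # [1, 3]
print(not_null)       # [1, 3]
print(null_eq_null)   # None
```

Rows 2 and 4 are dropped by both predicates, which is exactly why the rewrite to IS NOT NULL is legal.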
Which databases do this?
DB2
Yes!
Explain Plan
-------------------------------------------------
ID | Operation | Rows | Cost
1 | RETURN | | 49
2 | TBSCAN FILM | 1000 of 1000 (100.00%) | 49
Predicate Information
2 - SARG Q1.RELEASE_YEAR IS NOT NULL
MySQL
It's a shame, but MySQL, again, does not display predicates in its execution plans, so finding out whether MySQL implements this particular optimization is a bit tricky. You could run a benchmark and see whether any comparisons are executed at scale. Or you can add an index:
CREATE INDEX i_release_year ON film (release_year);
And get the plans for the following queries:
SELECT * FROM film WHERE release_year = release_year;
SELECT * FROM film WHERE release_year IS NOT NULL;
If the optimization works, the plans of both queries should be roughly the same. But in this case they are not:
ID TABLE POSSIBLE_KEYS ROWS FILTERED EXTRA
------------------------------------------------------
1 film 1000 10.00 Using where
ID TABLE POSSIBLE_KEYS ROWS FILTERED EXTRA
------------------------------------------------------
1 film i_release_year 1000 100.00 Using where
As you can see, our two queries differ substantially in the values of the POSSIBLE_KEYS and FILTERED columns. So my educated guess is that MySQL does not optimize this.
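The "add an index and compare the plans" technique is not specific to MySQL. As a rough sketch, here it is scripted against SQLite through Python's sqlite3 module. SQLite is only a stand-in engine here, the helper name plan is made up for this example, and whatever it prints says nothing about MySQL's behavior.

```python
import sqlite3

# Stand-in schema: an empty film table with the same index as in the article.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE film (film_id INTEGER PRIMARY KEY, release_year INTEGER)")
conn.execute("CREATE INDEX i_release_year ON film (release_year)")


def plan(sql):
    """Return the human-readable 'detail' column of EXPLAIN QUERY PLAN."""
    return [row[3] for row in conn.execute("EXPLAIN QUERY PLAN " + sql)]


plan_self = plan("SELECT * FROM film WHERE release_year = release_year")
plan_not_null = plan("SELECT * FROM film WHERE release_year IS NOT NULL")

print(plan_self)
print(plan_not_null)
# If the engine rewrote the first predicate into the second,
# we would expect the two plans to match.
print("plans identical:", plan_self == plan_not_null)
```

As with the MySQL output above, identical plans are only circumstantial evidence of the rewrite, and differing plans are only circumstantial evidence that it is missing.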
Oracle
Yes!
----------------------------------------------------
| Id | Operation | Name | Starts | E-Rows |
----------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | |
|* 1 | TABLE ACCESS FULL| FILM | 1 | 1000 |
----------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
1 - filter("RELEASE_YEAR" IS NOT NULL)
PostgreSQL
Unfortunately not!
QUERY PLAN
--------------------------------------------------------------
Seq Scan on film (cost=0.00..67.50 rows=5 width=386)
Filter: ((release_year)::integer = (release_year)::integer)
Both the plans and the costs differ. In particular, look at the cardinality estimate, which is way off (5 rows), whereas this predicate:
SELECT * FROM film WHERE release_year IS NOT NULL;
gives a much better estimate:
QUERY PLAN
---------------------------------------------------------
Seq Scan on film (cost=0.00..65.00 rows=1000 width=386)
Filter: (release_year IS NOT NULL)
Bummer!
SQL Server
Strangely, SQL Server does not do this either:
|--Table Scan(OBJECT:([film]), WHERE:([release_year]=[release_year]))
However, judging by the plan, the cardinality estimate is correct, and so is the cost. But in my experience with SQL Server, I'd say that no optimization takes place in this case, because SQL Server would display the predicate that is actually executed in the plan (to see why, look at the CHECK constraint example below).
What about "meaningless" predicates on NOT NULL columns? The transformation above was only necessary because RELEASE_YEAR is nullable. What happens if you run the same meaningless query on, for example, the FILM_ID column?
SELECT * FROM film WHERE film_id = film_id
Now the predicate does not matter at all, does it? At least it shouldn't. But is that the case?
DB2
Yes!
Explain Plan
-------------------------------------------------
ID | Operation | Rows | Cost
1 | RETURN | | 49
2 | TBSCAN FILM | 1000 of 1000 (100.00%) | 49
No predicate is applied at all and we select all films.
MySQL
Yes! (Again, an educated guess)
ID TABLE POSSIBLE_KEYS ROWS FILTERED EXTRA
------------------------------------------------------
1 film 1000 100.00
Note that the EXTRA column is now empty, as if there were no WHERE clause at all!
Oracle
Yes!
----------------------------------------------------
| Id | Operation | Name | Starts | E-Rows |
----------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | |
| 1 | TABLE ACCESS FULL| FILM | 1 | 1000 |
----------------------------------------------------
Again, no predicate is applied.
PostgreSQL
Whoa, no again!
QUERY PLAN
------------------------------------------------------
Seq Scan on film (cost=0.00..67.50 rows=5 width=386)
Filter: (film_id = film_id)
The filter is applied and the cardinality estimate is still 5. Bummer!
SQL Server
And here again, no!
|--Table Scan(OBJECT:([film]), WHERE:([film_id]=[film_id]))
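Why an optimizer is allowed to drop the predicate entirely for a NOT NULL column can be illustrated with a small sketch, again using SQLite via Python's sqlite3 module as a stand-in engine with a hypothetical three-row table. The column is declared NOT NULL without PRIMARY KEY on purpose, because SQLite auto-assigns a value when NULL is inserted into an INTEGER PRIMARY KEY.

```python
import sqlite3

# Hypothetical miniature of the film table; film_id can never be NULL.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE film (film_id INTEGER NOT NULL, title TEXT)")
conn.executemany(
    "INSERT INTO film (film_id, title) VALUES (?, ?)",
    [(1, "Film A"), (2, "Film B"), (3, "Film C")],
)

total = conn.execute("SELECT COUNT(*) FROM film").fetchone()[0]
filtered = conn.execute(
    "SELECT COUNT(*) FROM film WHERE film_id = film_id"
).fetchone()[0]

# The constraint rejects the only value (NULL) for which the
# predicate would not evaluate to TRUE.
try:
    conn.execute("INSERT INTO film (film_id, title) VALUES (NULL, 'Film D')")
    null_rejected = False
except sqlite3.IntegrityError:
    null_rejected = True

print(total, filtered, null_rejected)  # 3 3 True
```

Since no row can ever fail the comparison, the query with the predicate and the query without it are guaranteed to return the same rows.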
Summary
It seems like a simple optimization, but it is not applied in all DBMSs; in particular, and strangely enough, it is not applied in SQL Server!
Database        | Meaningless but necessary predicate (NULL semantics) | Meaningless and unnecessary predicate (non-NULL semantics)
-----------------------------------------------------------------------------------------------------------------------------------
DB2 LUW 10.5    | Yes                                                   | Yes
MySQL 8.0.2     | No                                                    | Yes
Oracle 12.2.0.1 | Yes                                                   | Yes
PostgreSQL 9.6  | No                                                    | No
SQL Server 2014 | No                                                    | No
5. Projections in EXISTS subqueries
Interestingly, I get asked about this all the time in my master class, where I defend the view that SELECT * mostly does no good. The question is: is it OK to use SELECT * in an EXISTS subquery? For example, if we need to find the actors who have played in films...
SELECT first_name, last_name
FROM actor a
WHERE EXISTS (
SELECT * -- Is this OK?
FROM film_actor fa
WHERE a.actor_id = fa.actor_id
)
And the answer is... yes. We can. The asterisk does not affect the query. How can you be sure of this? Consider the following queries:
-- DB2
SELECT 1 / 0 FROM sysibm.dual
-- Oracle
SELECT 1 / 0 FROM dual
-- PostgreSQL, SQL Server
SELECT 1 / 0
-- MySQL
SELECT pow(-1, 0.5);
All of these databases report a division by zero error. Note an interesting fact: in MySQL, dividing by zero yields NULL rather than an error, so there we have to do something else that is not allowed. Now, what happens if we run the following queries instead?
-- DB2
SELECT CASE WHEN EXISTS (
SELECT 1 / 0 FROM sysibm.dual
) THEN 1 ELSE 0 END
FROM sysibm.dual
-- Oracle
SELECT CASE WHEN EXISTS (
SELECT 1 / 0 FROM dual
) THEN 1 ELSE 0 END
FROM dual
-- PostgreSQL
SELECT EXISTS (SELECT 1 / 0)
-- SQL Server
SELECT CASE WHEN EXISTS (
SELECT 1 / 0
) THEN 1 ELSE 0 END
-- MySQL
SELECT EXISTS (SELECT pow(-1, 0.5));
Now none of the databases raises an error. They all return TRUE or 1. This means that none of the databases actually evaluates the projection (i.e. the SELECT clause) of the EXISTS subquery. SQL Server, for example, shows the following plan:
|--Constant Scan(VALUES:((CASE WHEN (1) THEN (1) ELSE (0) END)))
As you can see, the CASE expression has been turned into a constant and the subquery has been eliminated. Other databases keep the subquery in the plan and do not mention anything about the projection, so let's take another look at the plan of the original query in Oracle:
SELECT first_name, last_name
FROM actor a
WHERE EXISTS (
SELECT *
FROM film_actor fa
WHERE a.actor_id = fa.actor_id
)
The plan of the above query looks like this:
------------------------------------------------------------------
| Id | Operation | Name | E-Rows |
------------------------------------------------------------------
| 0 | SELECT STATEMENT | | |
|* 1 | HASH JOIN SEMI | | 200 |
| 2 | TABLE ACCESS FULL | ACTOR | 200 |
| 3 | INDEX FAST FULL SCAN| IDX_FK_FILM_ACTOR_ACTOR | 5462 |
------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
1 - access("A"."ACTOR_ID"="FA"."ACTOR_ID")
Column Projection Information (identified by operation id):
-----------------------------------------------------------
1 - (#keys=1) LAST_NAME, FIRST_NAME
2 - (rowset=256) A.ACTOR_ID, FIRST_NAME, LAST_NAME
3 - FA.ACTOR_ID
We can see the projection information at Id=3. In fact, we do not even access the FILM_ACTOR table, because we do not need to. The EXISTS predicate can be evaluated using the foreign key index on the single ACTOR_ID column, which is all this query needs, even though we wrote SELECT *.
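The reason this elimination is safe is that EXISTS only tests whether the subquery returns at least one row, so the projection cannot change the result. Here is a minimal sketch of that semantic point, run against SQLite through Python's sqlite3 module. SQLite is not among the databases tested here, and the tiny actor and film_actor tables, their contents, and the helper actors_with_films are assumptions made for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE actor (actor_id INTEGER PRIMARY KEY, first_name TEXT, last_name TEXT);
    CREATE TABLE film_actor (actor_id INTEGER NOT NULL, film_id INTEGER NOT NULL);
    INSERT INTO actor VALUES (1, 'A', 'One'), (2, 'B', 'Two'), (3, 'C', 'Three');
    INSERT INTO film_actor VALUES (1, 10), (1, 11), (3, 10);
""")


def actors_with_films(projection):
    """Run the EXISTS query with the given projection in the subquery."""
    sql = f"""
        SELECT a.actor_id
        FROM actor a
        WHERE EXISTS (
            SELECT {projection}
            FROM film_actor fa
            WHERE a.actor_id = fa.actor_id
        )
        ORDER BY a.actor_id
    """
    return [row[0] for row in conn.execute(sql)]


# Four different projections, including one that projects only NULL.
results = {p: actors_with_films(p) for p in ("*", "1", "NULL", "fa.film_id")}

for projection, actor_ids in results.items():
    print(projection, actor_ids)  # every projection yields [1, 3]
```

Even SELECT NULL produces the same answer, because a row containing NULL is still a row.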
Summary
Luckily, all databases remove the projection from EXISTS subqueries:
Database        | EXISTS projection removed
--------------------------------------------
DB2 LUW 10.5    | Yes
MySQL 8.0.2     | Yes
Oracle 12.2.0.1 | Yes
PostgreSQL 9.6  | Yes
SQL Server 2014 | Yes
Stay tuned for Part 3, where we will discuss other SQL optimizations.