
INNER JOIN ON vs WHERE clause

big-blog 2020. 9. 28. 09:29


For simplicity, assume all relevant fields are NOT NULL.

You can do:

SELECT
    table1.this, table2.that, table2.somethingelse
FROM
    table1, table2
WHERE
    table1.foreignkey = table2.primarykey
    AND (some other conditions)

Or else:

SELECT
    table1.this, table2.that, table2.somethingelse
FROM
    table1 INNER JOIN table2
    ON table1.foreignkey = table2.primarykey
WHERE
    (some other conditions)

Do these two work the same way in MySQL?


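INNER JOIN is the ANSI syntax, and it is the one you should use.

It is generally considered more readable, especially when you join many tables.

It can also be easily replaced with an OUTER JOIN whenever the need arises.

The WHERE syntax is more relational-model oriented.

The result of joining two tables is the Cartesian product of the tables, to which a filter is applied that selects only the rows whose join columns match.

This is easier to see with the WHERE syntax.

As for your example, in MySQL (and in SQL generally) these two queries are synonyms.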
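A minimal sketch of that equivalence, assuming two hypothetical tables (`customers`, `orders` and their rows are made up for illustration and are not part of the question):

```sql
-- Hypothetical tables, made up for illustration only.
CREATE TABLE customers (
    id   INT PRIMARY KEY,
    name VARCHAR(50) NOT NULL
);
CREATE TABLE orders (
    id          INT PRIMARY KEY,
    customer_id INT NOT NULL,
    total       INT NOT NULL
);

INSERT INTO customers (id, name) VALUES (1, 'Ann'), (2, 'Bob');
INSERT INTO orders (id, customer_id, total)
VALUES (10, 1, 50), (11, 1, 200), (12, 2, 75);

-- Implicit (comma) join: the join condition sits in WHERE.
SELECT o.id, c.name, o.total
FROM orders o, customers c
WHERE o.customer_id = c.id
  AND o.total > 60
ORDER BY o.id;

-- Explicit join: the join condition sits in ON, the filter in WHERE.
SELECT o.id, c.name, o.total
FROM orders o INNER JOIN customers c
    ON o.customer_id = c.id
WHERE o.total > 60
ORDER BY o.id;
```

Both queries return the same two rows (orders 11 and 12). To check whether the optimizer also treats them the same on your own data, you can prefix each SELECT with EXPLAIN and compare the plans.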

Also note that MySQL has a STRAIGHT_JOIN clause.

Using this clause, you can control the JOIN order: which table is scanned in the outer loop and which one in the inner loop.

You cannot control this in MySQL using the WHERE syntax.
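As a rough illustration (MySQL only), STRAIGHT_JOIN can be written in place of INNER JOIN. The tables and columns here are the placeholder names from the question, so this is a sketch rather than a query to run as-is:

```sql
-- MySQL-specific: the left table (table1) is always read before the
-- right table (table2), regardless of what the optimizer would pick.
SELECT table1.this, table2.that, table2.somethingelse
FROM table1
STRAIGHT_JOIN table2
    ON table1.foreignkey = table2.primarykey;
```

Forcing the order is only worth it when EXPLAIN shows the optimizer choosing a poor one; otherwise a plain INNER JOIN leaves that decision to MySQL.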


Others have pointed out that INNER JOIN helps human readability, and that this is a top priority. I agree. Let me try to explain why the join syntax is more readable.

A basic SELECT query looks like this:

SELECT stuff
FROM tables
WHERE conditions

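The SELECT clause tells us what we're getting back; the FROM clause tells us where we're getting it from; and the WHERE clause tells us which ones we're getting.

JOIN is a statement about the tables: how they are bound together (conceptually, into a single table). Any query element that controls the tables, that is, where we're getting stuff from, semantically belongs to the FROM clause (and of course, that's where JOIN elements go). Putting joining elements into the WHERE clause conflates the "which" and the "where from". That is why the JOIN syntax is preferred.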


Applying conditional statements in ON / WHERE

Here I have explained the logical query processing steps.


Reference: Inside Microsoft® SQL Server™ 2005 T-SQL Querying
Publisher: Microsoft Press
Pub Date: March 7, 2006
Print ISBN-10: 0-7356-2313-9
Print ISBN-13: 978-0-7356-2313-2
Pages: 640

Inside Microsoft® SQL Server™ 2005 T-SQL Querying

(8)  SELECT (9) DISTINCT (11) TOP <top_specification> <select_list>
(1)  FROM <left_table>
(3)       <join_type> JOIN <right_table>
(2)       ON <join_condition>
(4)  WHERE <where_condition>
(5)  GROUP BY <group_by_list>
(6)  WITH {CUBE | ROLLUP}
(7)  HAVING <having_condition>
(10) ORDER BY <order_by_list>

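The first noticeable aspect of SQL that differs from other programming languages is the order in which the code is processed. In most programming languages, the code is processed in the order in which it is written. In SQL, the first clause that is processed is the FROM clause, while the SELECT clause, which appears first, is processed almost last.

Each step generates a virtual table that is used as the input to the following step. These virtual tables are not available to the caller (client application or outer query). Only the table generated by the final step is returned to the caller. If a certain clause is not specified in a query, the corresponding step is simply skipped.

Brief Description of Logical Query Processing Phases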

Don't worry too much if the description of the steps doesn't seem to make much sense for now. These are provided as a reference. Sections that come after the scenario example will cover the steps in much more detail.

  1. FROM: A Cartesian product (cross join) is performed between the first two tables in the FROM clause, and as a result, virtual table VT1 is generated.

  2. ON: The ON filter is applied to VT1. Only rows for which the <join_condition> is TRUE are inserted to VT2.

  3. OUTER (join): If an OUTER JOIN is specified (as opposed to a CROSS JOIN or an INNER JOIN), rows from the preserved table or tables for which a match was not found are added to the rows from VT2 as outer rows, generating VT3. If more than two tables appear in the FROM clause, steps 1 through 3 are applied repeatedly between the result of the last join and the next table in the FROM clause until all tables are processed.

  4. WHERE: The WHERE filter is applied to VT3. Only rows for which the <where_condition> is TRUE are inserted to VT4.

  5. GROUP BY: The rows from VT4 are arranged in groups based on the column list specified in the GROUP BY clause. VT5 is generated.

  6. CUBE | ROLLUP: Supergroups (groups of groups) are added to the rows from VT5, generating VT6.

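  7. HAVING: The HAVING filter is applied to VT6. Only groups for which the <having_condition> is TRUE are inserted to VT7.

  8. SELECT: The SELECT list is processed, generating VT8.

  9. DISTINCT: Duplicate rows are removed from VT8. VT9 is generated.

  10. ORDER BY: The rows from VT9 are sorted according to the column list specified in the ORDER BY clause. A cursor is generated (VC10).

  11. TOP: The specified number or percentage of rows is selected from the beginning of VC10. Table VT11 is generated and returned to the caller.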



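Therefore, in this logical order, the (INNER JOIN) ON filter is applied before the WHERE clause, so the row count of the virtual table is already reduced at that step. Subsequent join conditions then run against the filtered data, which can improve performance. Only after that does the WHERE clause apply its filter conditions.

(In some cases, placing a condition in ON rather than in WHERE will not make much difference. This depends on how many tables you have joined and on the number of rows in each joined table.)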
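For an INNER JOIN, step 3 adds nothing, so moving a predicate between ON and WHERE does not change the result. With an outer join the placement does matter, which the following sketch tries to show. The tables `customers` and `accounts` and their rows are hypothetical, made up for illustration:

```sql
-- Hypothetical tables, made up for illustration only.
CREATE TABLE customers (
    id    INT PRIMARY KEY,
    state CHAR(2) NOT NULL
);
CREATE TABLE accounts (
    id          INT PRIMARY KEY,
    customer_id INT NOT NULL,
    status      INT NOT NULL
);

INSERT INTO customers (id, state) VALUES (1, 'NY'), (2, 'NY');
INSERT INTO accounts (id, customer_id, status) VALUES (100, 1, 1), (101, 2, 0);

-- Filter in ON (step 2): account 101 fails the ON condition, so
-- customer 2 has no match and step 3 adds it back as an outer row
-- with account_id NULL. Two rows come back.
SELECT c.id, a.id AS account_id
FROM customers c
LEFT JOIN accounts a
    ON a.customer_id = c.id
    AND a.status = 1
ORDER BY c.id;

-- Filter in WHERE (step 4): customer 2 is joined to account 101 first,
-- then that joined row is removed because its status is 0.
-- One row comes back.
SELECT c.id, a.id AS account_id
FROM customers c
LEFT JOIN accounts a
    ON a.customer_id = c.id
WHERE a.status = 1
ORDER BY c.id;
```

The first query keeps every customer and only restricts which accounts are attached; the second restricts the final result, which effectively turns the LEFT JOIN into an inner join.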


The implicit join ANSI syntax is older, less obvious and not recommended.

In addition, relational algebra allows interchangeability of the predicates in the WHERE clause and the INNER JOIN, so even INNER JOIN queries with WHERE clauses can have the predicates rearranged by the optimizer.

I recommend you write the queries in the most readable way possible.

Sometimes this includes making the INNER JOIN relatively "incomplete" and putting some of the criteria in the WHERE simply to make the lists of filtering criteria more easily maintainable.

For example, instead of:

SELECT *
FROM Customers c
INNER JOIN CustomerAccounts ca
    ON ca.CustomerID = c.CustomerID
    AND c.State = 'NY'
INNER JOIN Accounts a
    ON ca.AccountID = a.AccountID
    AND a.Status = 1

Write:

SELECT *
FROM Customers c
INNER JOIN CustomerAccounts ca
    ON ca.CustomerID = c.CustomerID
INNER JOIN Accounts a
    ON ca.AccountID = a.AccountID
WHERE c.State = 'NY'
    AND a.Status = 1

But it depends, of course.


Implicit joins (which is what your first query is known as) become much much more confusing, hard to read, and hard to maintain once you need to start adding more tables to your query. Imagine doing that same query and type of join on four or five different tables ... it's a nightmare.

Using an explicit join (your second example) is much more readable and easy to maintain.


I'll also point out that using the older syntax is more subject to error. If you use inner joins without an ON clause, you will get a syntax error in most databases (MySQL is an exception: it accepts INNER JOIN without ON and treats it as a cross join). If you use the older syntax and forget one of the join conditions in the WHERE clause, you will get a cross join. Developers often fix this by adding the DISTINCT keyword (rather than fixing the join, because they still don't realize the join itself is broken), which may appear to cure the problem but will slow down the query considerably.

Additionally for maintenance if you have a cross join in the old syntax, how will the maintainer know if you meant to have one (there are situations where cross joins are needed) or if it was an accident that should be fixed?
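A small sketch of that failure mode, using hypothetical tables (`customers`, `orders` and their rows are made up for illustration):

```sql
-- Hypothetical tables, made up for illustration only.
CREATE TABLE customers (
    id   INT PRIMARY KEY,
    name VARCHAR(50) NOT NULL
);
CREATE TABLE orders (
    id          INT PRIMARY KEY,
    customer_id INT NOT NULL
);

INSERT INTO customers (id, name) VALUES (1, 'Ann'), (2, 'Bob');
INSERT INTO orders (id, customer_id) VALUES (10, 1), (11, 1), (12, 2);

-- Join condition forgotten: every customer is paired with every order,
-- so 2 x 3 = 6 rows come back.
SELECT c.name
FROM customers c, orders o;

-- DISTINCT collapses the six rows to two and hides the symptom,
-- but the cross join is still computed underneath.
SELECT DISTINCT c.name
FROM customers c, orders o;

-- The actual fix: restore the join condition (3 rows, one per order).
SELECT c.name
FROM customers c
INNER JOIN orders o
    ON o.customer_id = c.id;
```

With the explicit syntax the join condition sits right next to the table it belongs to, so a missing one is much easier to spot in review.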

Let me point you to this question to see why the implicit syntax is bad if you use left joins: "Sybase *= to Ansi Standard with 2 different outer tables for same inner table".

Plus (personal rant here), the standard using the explicit joins is over 20 years old, which means implicit join syntax has been outdated for those 20 years. Would you write application code using syntax that has been outdated for 20 years? Why do you want to write database code that is?


They have a different human-readable meaning.

However, depending on the query optimizer, they may have the same meaning to the machine.

You should always code to be readable.

That is to say, if this is a built-in relationship, use the explicit join. If you are matching on weakly related data, use the WHERE clause.


The SQL:2003 standard changed some precedence rules so a JOIN statement takes precedence over a "comma" join. This can actually change the results of your query depending on how it is set up. This caused some problems for some people when MySQL 5.0.12 switched to adhering to the standard.

So in your example, your queries would work the same. But if you added a third table: SELECT ... FROM table1, table2 JOIN table3 ON ... WHERE ...

Prior to MySQL 5.0.12, table1 and table2 would be joined first, then table3. Now (5.0.12 and on), table2 and table3 are joined first, then table1. It doesn't always change the results, but it can and you may not even realize it.
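One visible symptom of the new precedence is an ON clause that refers to a table on the far side of a comma. The sketch below uses three hypothetical single-column tables (`t1`, `t2`, `t3` are made up for illustration):

```sql
-- Hypothetical tables, made up for illustration only.
CREATE TABLE t1 (i1 INT NOT NULL);
CREATE TABLE t2 (i2 INT NOT NULL);
CREATE TABLE t3 (i3 INT NOT NULL);

-- Since MySQL 5.0.12 the statement below is grouped as
-- t1, (t2 JOIN t3 ON ...), so t1 is not visible inside the ON clause
-- and MySQL rejects it with: Unknown column 't1.i1' in 'on clause'.
-- It is left commented out so the rest of the script runs.
-- SELECT * FROM t1, t2 JOIN t3 ON t1.i1 = t3.i3;

-- Parentheses restore the old grouping explicitly.
SELECT * FROM (t1, t2) JOIN t3 ON t1.i1 = t3.i3;

-- Or avoid the comma operator altogether (MySQL allows JOIN without ON).
SELECT * FROM t1 JOIN t2 JOIN t3 ON t1.i1 = t3.i3;
```

When the ON clause only mentions t2 and t3, no error is raised, which is the case where the change in grouping can go unnoticed.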

I never use the "comma" syntax anymore, opting for your second example. It's a lot more readable anyway, the JOIN conditions are with the JOINs, not separated into a separate query section.


I know you're talking about MySQL, but anyway: In Oracle 9 explicit joins and implicit joins would generate different execution plans. AFAIK that has been solved in Oracle 10+: there's no such difference anymore.


ANSI join syntax is definitely more portable.

I'm going through an upgrade of Microsoft SQL Server, and I would also mention that the =* and *= syntax for outer joins is not supported (without compatibility mode) in SQL Server 2005 and later.


If you often write dynamic stored procedures, you will fall in love with your second example (using where). If you have various input parameters and lots of morph mess, then that is the only way. Otherwise, both will run with the same query plan, so there is no obvious difference in classic queries.

Reference URL: https://stackoverflow.com/questions/1018822/inner-join-on-vs-where-clause
