Natural join seems to eliminate rows which it shouldn't #977

@alexpdp7

MySQL [gitbase]> select blob_hash, repository_id from blobs natural join repositories where blob_hash in ('93ec5b4525363844ddb1981adf1586ebddbc21c1', 'aad34590345310fe813fd1d9eff868afc4cea10c', 'ed82eb69daf806e521840f4320ea80d4fe0af435');
+------------------------------------------+-------------------------------------+
| blob_hash                                | repository_id                       |
+------------------------------------------+-------------------------------------+
| aad34590345310fe813fd1d9eff868afc4cea10c | github.com/bblfsh/javascript-driver |
| ed82eb69daf806e521840f4320ea80d4fe0af435 | github.com/src-d/enry               |
| aad34590345310fe813fd1d9eff868afc4cea10c | github.com/bblfsh/python-driver     |
| 93ec5b4525363844ddb1981adf1586ebddbc21c1 | github.com/src-d/go-mysql-server    |
| aad34590345310fe813fd1d9eff868afc4cea10c | github.com/bblfsh/ruby-driver       |
| ed82eb69daf806e521840f4320ea80d4fe0af435 | github.com/src-d/gitbase            |
+------------------------------------------+-------------------------------------+
6 rows in set (14.90 sec)

MySQL [gitbase]> select blob_hash, repository_id from blobs where blob_hash in ('93ec5b4525363844ddb1981adf1586ebddbc21c1', 'aad34590345310fe813fd1d9eff868afc4cea10c', 'ed82eb69daf806e521840f4320ea80d4fe0af435');
+------------------------------------------+-------------------------------------+
| blob_hash                                | repository_id                       |
+------------------------------------------+-------------------------------------+
| aad34590345310fe813fd1d9eff868afc4cea10c | github.com/bblfsh/python-driver     |
| aad34590345310fe813fd1d9eff868afc4cea10c | github.com/bblfsh/javascript-driver |
| ed82eb69daf806e521840f4320ea80d4fe0af435 | github.com/src-d/enry               |
| aad34590345310fe813fd1d9eff868afc4cea10c | github.com/bblfsh/ruby-driver       |
| 93ec5b4525363844ddb1981adf1586ebddbc21c1 | github.com/src-d/gitbase            |
| ed82eb69daf806e521840f4320ea80d4fe0af435 | github.com/src-d/gitbase            |
| 93ec5b4525363844ddb1981adf1586ebddbc21c1 | github.com/src-d/go-mysql-server    |
| ed82eb69daf806e521840f4320ea80d4fe0af435 | github.com/src-d/go-mysql-server    |
+------------------------------------------+-------------------------------------+
8 rows in set (0.13 sec)

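The natural join returns 6 rows, while the same query without the join returns 8, so the join appears to drop two rows that it should keep.

Also note that removing the natural join makes the query much faster (0.13 sec versus 14.90 sec). My understanding was that we normally want to join with repositories to benefit from some specific optimizations, although I'm guessing that filtering by blob_hash makes those optimizations moot.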
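To make the discrepancy concrete, here is a minimal sketch that replays the data from this report in an in-memory SQLite database and compares the result with what gitbase returned. It rests on several assumptions: SQLite stands in for gitbase, the tables are reduced to the two columns shown above, `repository_id` is the only column the two tables share, and every repository that appears in `blobs` has a row in `repositories`. The helper name `expected_natural_join` is hypothetical.

```python
import sqlite3

# Rows copied from the plain `blobs` query in this report.
BLOB_ROWS = [
    ("aad34590345310fe813fd1d9eff868afc4cea10c", "github.com/bblfsh/python-driver"),
    ("aad34590345310fe813fd1d9eff868afc4cea10c", "github.com/bblfsh/javascript-driver"),
    ("ed82eb69daf806e521840f4320ea80d4fe0af435", "github.com/src-d/enry"),
    ("aad34590345310fe813fd1d9eff868afc4cea10c", "github.com/bblfsh/ruby-driver"),
    ("93ec5b4525363844ddb1981adf1586ebddbc21c1", "github.com/src-d/gitbase"),
    ("ed82eb69daf806e521840f4320ea80d4fe0af435", "github.com/src-d/gitbase"),
    ("93ec5b4525363844ddb1981adf1586ebddbc21c1", "github.com/src-d/go-mysql-server"),
    ("ed82eb69daf806e521840f4320ea80d4fe0af435", "github.com/src-d/go-mysql-server"),
]

# Rows gitbase returned for the natural join, copied from this report.
OBSERVED_JOIN_ROWS = [
    ("aad34590345310fe813fd1d9eff868afc4cea10c", "github.com/bblfsh/javascript-driver"),
    ("ed82eb69daf806e521840f4320ea80d4fe0af435", "github.com/src-d/enry"),
    ("aad34590345310fe813fd1d9eff868afc4cea10c", "github.com/bblfsh/python-driver"),
    ("93ec5b4525363844ddb1981adf1586ebddbc21c1", "github.com/src-d/go-mysql-server"),
    ("aad34590345310fe813fd1d9eff868afc4cea10c", "github.com/bblfsh/ruby-driver"),
    ("ed82eb69daf806e521840f4320ea80d4fe0af435", "github.com/src-d/gitbase"),
]


def expected_natural_join(blob_rows):
    """Run the natural join in SQLite (a stand-in, not gitbase itself)."""
    conn = sqlite3.connect(":memory:")
    # Assumed, simplified schema: only the columns used in this report.
    conn.execute("CREATE TABLE blobs (blob_hash TEXT, repository_id TEXT)")
    conn.execute("CREATE TABLE repositories (repository_id TEXT)")
    conn.executemany("INSERT INTO blobs VALUES (?, ?)", blob_rows)
    # Assumption: each repository seen in blobs has exactly one repositories row.
    repo_ids = sorted({repo for _, repo in blob_rows})
    conn.executemany("INSERT INTO repositories VALUES (?)", [(r,) for r in repo_ids])
    rows = conn.execute(
        "SELECT blob_hash, repository_id FROM blobs NATURAL JOIN repositories"
    ).fetchall()
    conn.close()
    return rows


expected = expected_natural_join(BLOB_ROWS)
missing = sorted(set(expected) - set(OBSERVED_JOIN_ROWS))

print(len(expected), "rows expected from the natural join")
for blob_hash, repository_id in missing:
    print("missing:", blob_hash, repository_id)
```

Under these assumptions the join should preserve all 8 rows, and the set difference lists the two rows that the gitbase result lacks.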

Labels: blocked, bug
