Difference:
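1. A partition is a folder: each distinct value (or combination of values) of the partition columns gets its own directory under the table location.
A bucket is a file: each bucket is stored as a file inside the table directory or, for a partitioned table, inside each partition directory.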
2. Go with partitioning when the column has few distinct values (low cardinality).
Go with bucketing when the column has many distinct values (high cardinality).
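3. Partitions are a logical division of the data by column value, so the number of partitions depends on the data.
Buckets are based on a hash of the column value, and the number of buckets is fixed when the table is created.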
4. Partition syntax:
Create table table_name(col1 datatype, col2 datatype, col3 datatype)
partitioned by (col4 datatype, col5 datatype);
Bucketing syntax:
Create table table_name(col1 datatype, col2 datatype, col3 datatype)
partitioned by (col4 datatype, col5 datatype)
clustered by (col2) into 50 buckets;
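
To make points 1 and 3 concrete, here is a minimal Python sketch of how a row could be mapped to a partition folder and a bucket file. It is only a model, not the engine's actual code. It assumes the bucketing column is an integer whose hash is the integer itself, and that bucket files are named with a zero-padded bucket number. The function names (bucket_for, storage_path) and the sample values are made up for illustration.

```python
NUM_BUCKETS = 50  # fixed at table creation, as in "into 50 buckets"


def bucket_for(value: int, num_buckets: int = NUM_BUCKETS) -> int:
    """Return the bucket number for an integer column value.

    Assumption: the hash of an integer is the integer itself; it is
    masked to a non-negative 31-bit value and reduced modulo the
    bucket count.
    """
    return (value & 0x7FFFFFFF) % num_buckets


def storage_path(table: str, partition_values: dict, bucket_value: int) -> str:
    """Build an illustrative path: one folder per partition value, one file per bucket."""
    # Each partition column contributes one directory level, e.g. col4=IN/col5=2023.
    folders = "/".join(f"{col}={val}" for col, val in partition_values.items())
    # The bucket number selects a file inside that partition directory.
    # The file naming here is hypothetical.
    return f"{table}/{folders}/{bucket_for(bucket_value):06d}_0"


if __name__ == "__main__":
    # A row with col4='IN', col5=2023 and col2=1234 (the clustered-by column).
    print(storage_path("table_name", {"col4": "IN", "col5": 2023}, 1234))
```

Two rows with the same col4 and col5 values always land in the same folder, while col2 only decides which of the 50 files inside that folder receives the row. This is why a column with many distinct values suits bucketing: the number of files stays fixed no matter how many values the column has.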
Video: https://www.youtube.com/watch?v=WKpELsHZ0Zc&list=PLVt87wOZJLOdtvKLe6X846CbuFNKOX95E&index=2