Sqoop split by string column
When the split column is a string, the TextSplitter split method needs to determine the splits between two user-provided strings. In the case where the user’s strings are ‘A’ and ‘Z’, this is not hard; we could create two splits from [‘A’, ‘M’) and [‘M’, ‘Z’], or 26 splits, one for strings beginning with each letter.
Ex1. If the split column's minimum and maximum values are A and Z and the import runs with 3 mappers, there are 3 splits: A-I in part-00000, J-Q in part-00001, and R-Z in part-00002. With 4 mappers there are 4 splits: A-G in part-00000, H-M in part-00001, N-S in part-00002, and T-Z in part-00003.
Ex2. If the minimum and maximum values are A and D and the import runs with 2 mappers, there are 2 splits: A-B in part-00000 and C-D in part-00001.
Ex3. If the minimum and maximum values are A and C and the import runs with 3 mappers, there are 3 splits: A in part-00000, B in part-00001, and C in part-00002. If no row has the value B, part-00001 is empty.
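The ranges in the three examples can be reproduced with a small Python sketch. It is a simplified model, not Sqoop's code: it assumes single-character values and cuts the code-point range between the minimum and maximum into equal-width intervals, where every interval is half-open except the last one. The function name letter_splits is made up for this illustration.

```python
from fractions import Fraction
from typing import List


def letter_splits(lo: str, hi: str, num_splits: int) -> List[str]:
    """Return the letters that land in each split, one string per split.

    Hypothetical helper for illustration only; it models single-character
    values and is not how Sqoop itself is implemented.
    """
    if len(lo) != 1 or len(hi) != 1 or lo >= hi or num_splits < 1:
        raise ValueError("need single characters with lo < hi and num_splits >= 1")

    start, end = ord(lo), ord(hi)
    # Exact width of one split; Fraction avoids floating-point rounding.
    width = Fraction(end - start, num_splits)

    buckets = [[] for _ in range(num_splits)]
    for code in range(start, end + 1):
        # Interval i covers [start + i*width, start + (i+1)*width).
        # The last interval also includes the maximum, hence the clamp.
        index = min(int((code - start) / width), num_splits - 1)
        buckets[index].append(chr(code))
    return ["".join(bucket) for bucket in buckets]


if __name__ == "__main__":
    for lo, hi, n in [("A", "Z", 3), ("A", "Z", 4), ("A", "D", 2), ("A", "C", 3)]:
        print(lo, hi, n, letter_splits(lo, hi, n))
```

Each returned string corresponds to one output file in order, so the first entry is what would land in part-00000. A split whose letters do not occur in the table still exists and simply produces an empty file, as in Ex3.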
Reference: http://hadoop.apache.org/docs/r2.10.0/api/org/apache/hadoop/mapreduce/lib/db/TextSplitter.html