regex - Creating structured hive table with unstructured GPS packets in csv format -


I have a CSV file as outlined below.

VTS, 51,0071,9739 9 65515, nm, gp, inf 01, v, 19, 7 9, 19, 291014, 0000.0000, n, 00000.0000, e, 07 AE VTS, 01.0097 9 7 9 9 65515, SP, GP, 18,07,2253, V, 0000.0000, N, 00000.0000, e, 0.0,0.0,291014,0000,00,4000,11,999,169, B205 VTS, 51, 0071.9739965515, NM, GP, INF0l, V, 18,072311,291014,0000.0000, N, 00000.0000, e, C24E VTS, 01, 0097.9739965515, NM, GP, 19.072311 , V, 0,000.0000, N, 00,000.0000, E, 0.0,0.0,291014,0000,00,4000,11,999,171, B358 VTS, 51,0071,9739965515, knots away, GP, INF01, V, 18,072319,291014,0000.0000, N, 00,000.0000, E, 012F VTS, 51,0071,9739965515, Mono miles away, GP, INF01, V, 19,072326,291014,0000.0000, N, 00,000.0000, E, B2E6 VTS, 0170097, 79, 965515, nm, gp, 18,072326, V, 0000.0000, N, 00000.0000, e, 0.0,0.0,291014,0000,00,4000,11,999,173, EAA 00 VTS, 5l,007l 9 739 9 65 515 nm, GP, INF 01, V, 18,072333,291014, 0000.0000, N, 00000.0000, e 98 9 6 VTS, 5l,007l 9 7 9 9 65 515 nm, GP, INF 01, V, 18,072340,291014,0000.0000, N, 00000.0000, e, 9 B 23

it is to be mapped with the field:

Pkt_header, Gprs_pkt_id, Pkt_length, Sim_no, Msg_id, Gprs_pkt, Gsm_sig_strength, Utc_time, Pkt_validation, latitude, Direction_n_s, longitude, direction_e_w, speed, track_angle, u Tc_date, fuel_adc_values, ignition, odometer_values, supply_int, battery_adc, pkt_id, check_sum

Valid packets are shown in the second field ie GPRS_PKT_ID value 01. The case I used is filtering the CSV data only for valid packets, for which I am using regex, but I am not able to get the full data. Any help would be greatly appreciated

The used hive query is shown below.

Create external table sky_track_testing1 (pkt_header STRING, gprs_pkt_id STRING, pkt_length STRING, sim_no STRING, msg_id STRING, gprs_pkt STRING, gsm_sig_strength STRING, utc_time STRING, pkt_validation STRING, Latitude STRING, STRING longitude STRING, direction_e_w STRING, speed STRING, track_angle STRING, utc_date STRING, fuel_adc_values ​​STRING, ignition STRING, odometer_values ​​STRING, supply_int STRING, battery_adc STRING, pkt_id STRING, check_sum STRING direction_n_s) row format SERDE 'org.apache.hadoop.hive.contrib .serde2.RegexSerDe with SERDEPROPERTIES ("input.regex" = "^ (VTS, 01). * $") Stored as textfile location '/ user / root / sky_track';

This is definitely a wrong query Please help me.

I recommend that you use it for:

  A = Load '/ user / root / sky_track' (pkt_header, gprs_pkt_id, pkt_length, sim_no, msg_id, gprs_pkt, gsm_sig_strength, utc_time, pkt_validation, latitude, direction_n_, longitude, direction_a_va, speed, track_elegal, utc_date, fuel_adc_value, ignition, Odometer_value, supply_int, battery_adc, pkt_id, check_sum); B = Filter gprs_pkt_id == '01'; Store 'b' / user / root / sky_track_valid ';  

Comments

Popular posts from this blog

apache - 504 Gateway Time-out The server didn't respond in time. How to fix it? -

c# - .net WebSocket: CloseOutputAsync vs CloseAsync -

c++ - How to properly scale qgroupbox title with stylesheet for high resolution display? -