Answers
What version of Vertica and Kafka do you have?
Are there any logs in Kafka or in Vertica (vertica.log specifically) that show any other errors or possible root cause such as an upstream exception, out of resources, socket closed, etc.?
From Vertica logs:
vertica.log-20200601.gz:2020-06-01 01:22:58.707 Init Session:0x7f030b7fe700-b000000cbe88d4 @node0002 : 55V03/5157: Unavailable: [Txn 0xb000000cbe88d4] O lock table - timeout error Timed out O locking Table:temp.flex. I held by [user verticauser (COPY temp.flex SOURCE KafkaSource(stream='temp.flex|1|-3,temp.flex|2|-3,temp.flex|3|-3,temp.flex|4|-3,temp.flex|0|-3', brokers='****:9092', duration=interval '1000 milliseconds') PARSER KafkaJSONParser(flatten_arrays = false, flatten_maps = false);)]. Your current transaction isolation level is SERIALIZABLE
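The lock holder can be read out of that message mechanically, which helps when there are many such lines to go through. Below is a minimal Python sketch, assuming the message layout matches the single line quoted above. The name `parse_lock_timeout` and the regular expression are illustrative only; they are not part of Vertica or its tooling.

```python
import re
from typing import Optional

# Pattern for the "Timed out ... locking Table:..." part of the message.
# Assumption: the layout matches the single vertica.log line quoted above.
LOCK_TIMEOUT = re.compile(
    r"Timed out (?P<requested>\w+) locking Table:(?P<table>\S+)\. "
    r"(?P<held>\w+) held by \[user (?P<user>\w+) \((?P<statement>.*)\)\]"
)


def parse_lock_timeout(line: str) -> Optional[dict]:
    """Return requested mode, table, held mode, user and statement, or None."""
    match = LOCK_TIMEOUT.search(line)
    return match.groupdict() if match else None


# Message text copied from the log line above (prefix before "Timed out" omitted).
SAMPLE = (
    "Timed out O locking Table:temp.flex. I held by [user verticauser "
    "(COPY temp.flex SOURCE KafkaSource(stream='temp.flex|1|-3,temp.flex|2|-3,"
    "temp.flex|3|-3,temp.flex|4|-3,temp.flex|0|-3', brokers='****:9092', "
    "duration=interval '1000 milliseconds') PARSER KafkaJSONParser("
    "flatten_arrays = false, flatten_maps = false);)]. "
    "Your current transaction isolation level is SERIALIZABLE"
)

info = parse_lock_timeout(SAMPLE)
print(info["requested"], info["table"], info["held"], info["user"])
```

For the line above this reports that an O lock was requested on temp.flex while an I lock was held by verticauser, with the COPY statement as the holder.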
On looking further at the sessions, I found that the session running the COPY command has not been closed for 9 days:
verticauser=> select * from v_monitor.sessions order by login_timestamp limit 1;
-[ RECORD 1 ]--------------+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
node_name | node0002
user_name | verticauser
client_hostname | ****:53160
client_pid | 6
login_timestamp | 2020-05-31 03:03:24.13073+00
session_id | node0002-379274:0x28b07
idle_session_timeout |
grace_period |
client_label | jdbc-09.02.0000-c3d34be5-17bb-4e2f-99d5-258db56061d3
transaction_start | 2020-05-31 20:29:26.351612+00
transaction_id | 49539596114782512
transaction_description | user verticauser (COPY temp.flex SOURCE KafkaSource(stream='temp.flex|1|-3,temp.flex|2|-3,temp.flex|3|-3,temp.flex|4|-3,temp.flex|0|-3', brokers='****:9092', duration=interval '1000 milliseconds') PARSER KafkaJSONParser(flatten_arrays = false, flatten_maps = false);)
statement_start | 2020-05-31 20:29:26.363663+00
statement_id | 1
last_statement_duration_us | 11149
runtime_priority | MEDIUM
current_statement | COPY temp.flex SOURCE KafkaSource(stream='temp.flex|1|-3,temp.flex|2|-3,temp.flex|3|-3,temp.flex|4|-3,temp.flex|0|-3', brokers='****:9092', duration=interval '1000 milliseconds') PARSER KafkaJSONParser(flatten_arrays = false, flatten_maps = false);
last_statement | CREATE FLEX TABLE IF NOT EXISTS temp.flex();
ssl_state | None
authentication_method | Password
client_type | JDBC Driver
client_version | 09.02.0000
client_os | Linux 4.15.0-1082-azure amd64
client_os_user_name | root
client_authentication_name | default: Password
client_authentication | 0
requested_protocol | 3.8
effective_protocol | 3.8
external_memory_kb | 0
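To put the two pieces of output side by side, the time between the COPY statement's statement_start and the lock timeout in vertica.log can be worked out directly. The sketch below is plain Python arithmetic on the two timestamps shown above. It assumes that vertica.log timestamps are in the same time zone (UTC) as the session record, which the output does not confirm; `parse_ts` is a hypothetical helper name.

```python
from datetime import datetime


def parse_ts(text: str) -> datetime:
    """Parse a timestamp as printed above, ignoring a trailing '+00' offset."""
    # Assumption: both timestamps are UTC, so the offset can simply be dropped.
    return datetime.strptime(text.split("+")[0], "%Y-%m-%d %H:%M:%S.%f")


# statement_start of the COPY session, from the session record above.
statement_start = parse_ts("2020-05-31 20:29:26.363663+00")

# Time of the lock timeout error, from the vertica.log line above.
lock_timeout_logged = parse_ts("2020-06-01 01:22:58.707")

held_for = lock_timeout_logged - statement_start
print(held_for)  # 4:53:32.343337
```

Under that assumption, the COPY statement had already been running for almost five hours when the other transaction timed out waiting for its lock.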
Please open a support case with the following information:
A) Error 100024 is a network error emitted by the Vertica JDBC driver.
Are you using any kind of firewall that might be blocking the connection?
B) For the long-running/hung COPY that you mentioned:
a) These tables are useful for identifying it:
1) Sessions table
2) Locks table
Sample queries:
1) select statement_start, node_name, left(current_statement,50) from sessions where current_statement ilike '%kafkasource%' order by statement_start desc;
2) SELECT locks.request_timestamp, locks.grant_timestamp,locks.lock_mode, locks.lock_scope, object_name, substr(locks.transaction_description, 1, 300) AS "left" FROM v_monitor.locks WHERE request_timestamp < sysdate - interval '6 minute' AND object_name not ilike '%.stream_lock' ORDER by request_timestamp;
b) Get vstacks at the time of the hang so that we can match them with step A.
Enable the logging below and collect scrutinize output:
1) Enable UDx logging in Vertica as below until you get the next hung session:
dbadmin=> select set_debug_log_allnodes('UDX','ALL');
set_debug_log_allnodes
Turned Debug ON for UDX
(1 row)
2) Update kafka_conf to 'debug=all' in the source, launch, cluster and sync tools as below (this will be helpful if we are testing through the scheduler):
a)vkconfig source --update --source test --cluster test_cluster --kafka_conf 'debug=all' --conf ./test.conf
b)vkconfig launch --kafka_conf 'debug=all' --conf ./test.conf &
c)vkconfig sync --kafka_conf 'debug=all' --conf ./test.conf
d)vkconfig cluster --update --cluster kafkacluster --kafka_conf 'debug=all' --validation-type WAR
I have opened a support case for the same.
Meanwhile, here is one observation that we made when the EOFException occurred.
From the Vertica logs:
2020-06-10 00:02:01.664 Init Session:0x7f0667bfd700-b000000d368c59 [EE] <INFO> SortManager found maxMerges 2 too small(64 MB Assigned).
2020-06-10 00:02:01.664 Init Session:0x7f0667bfd700-b000000d368c59 [EE] <INFO> After disabling optimization, maxMerges becomes 15.
2020-06-10 00:02:01.790 Init Session:0x7f0667bfd700-b000000d368c59 [EE] <INFO> SortManager found maxMerges 2 too small(64 MB Assigned).
2020-06-10 00:02:01.790 Init Session:0x7f0667bfd700-b000000d368c59 [EE] <INFO> After disabling optimization, maxMerges becomes 15.
2020-06-10 00:02:01.834 Init Session:0x7f0667bfd700-b000000d368c59 [EE] <INFO> SortManager found maxMerges 2 too small(64 MB Assigned).
2020-06-10 00:02:01.834 Init Session:0x7f0667bfd700-b000000d368c59 [EE] <INFO> After disabling optimization, maxMerges becomes 15.
2020-06-10 00:02:01.877 Init Session:0x7f0667bfd700-b000000d368c59 [EE] <INFO> SortManager found maxMerges 2 too small(64 MB Assigned).
2020-06-10 00:02:01.877 Init Session:0x7f0667bfd700-b000000d368c59 [EE] <INFO> After disabling optimization, maxMerges becomes 15.
2020-06-10 00:24:29.373 Init Session:0x7f0667bfd700-b000000d368c59 <LOG> @v_vymo_node0002: 08006/2907: Could not send data to client: No such file or directory
2020-06-10 00:24:29.373 Init Session:0x7f0667bfd700-b000000d368c59 <FATAL> @v_vymo_node0002: 08006/2607: Client has disconnected
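One thing that stands out in this excerpt is the silence between the last SortManager message and the disconnect. The following Python sketch measures that gap from the leading timestamps. It uses only the standard library and the last four entries of the excerpt, copied as-is; `timestamp` and `largest_gap` are illustrative helper names, and the sketch assumes every line starts with a 23-character timestamp as these do.

```python
from datetime import datetime, timedelta

# Last four entries of the excerpt above, copied as-is.
LOG_LINES = [
    "2020-06-10 00:02:01.877 Init Session:0x7f0667bfd700-b000000d368c59 [EE] <INFO> "
    "SortManager found maxMerges 2 too small(64 MB Assigned).",
    "2020-06-10 00:02:01.877 Init Session:0x7f0667bfd700-b000000d368c59 [EE] <INFO> "
    "After disabling optimization, maxMerges becomes 15.",
    "2020-06-10 00:24:29.373 Init Session:0x7f0667bfd700-b000000d368c59 <LOG> "
    "@v_vymo_node0002: 08006/2907: Could not send data to client: No such file or directory",
    "2020-06-10 00:24:29.373 Init Session:0x7f0667bfd700-b000000d368c59 <FATAL> "
    "@v_vymo_node0002: 08006/2607: Client has disconnected",
]


def timestamp(line: str) -> datetime:
    """Parse the leading 'YYYY-MM-DD HH:MM:SS.mmm' timestamp of a log line."""
    return datetime.strptime(line[:23], "%Y-%m-%d %H:%M:%S.%f")


def largest_gap(lines):
    """Return (gap, line_before, line_after) for the longest silence in the log."""
    stamps = [timestamp(line) for line in lines]
    gaps = [(later - earlier, i)
            for i, (earlier, later) in enumerate(zip(stamps, stamps[1:]))]
    gap, index = max(gaps)
    return gap, lines[index], lines[index + 1]


gap, before, after = largest_gap(LOG_LINES)
print(gap)  # 0:22:27.496000
```

So the session logged nothing for about 22.5 minutes before Vertica reported that it could not send data to the client.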