python /home/admin/mtr/script_for_cron.py -j datou_current3 -m 20 -a ' -a 3318 ' -s datou_3318 -M 0 -S 0 -U 95,95,120
import MySQLdb succeeded
Import error (python version)
['/Users/moilerat/Documents/Fotonower/install/caffe/distribute/python', '/home/admin/workarea/git/Velours/python/prod', '/home/admin/workarea/install/caffe_cuda8_python3/python', '/home/admin/workarea/install/darknet', '/home/admin/workarea/git/Velours/python', '/home/admin/workarea/install/caffe_frcnn_python3/py-faster-rcnn/caffe-fast-rcnn/python', '/home/admin/mtr/.credentials', '/home/admin/workarea/install/caffe/python', '/home/admin/workarea/install/caffe_frcnn/py-faster-rcnn/tools', '/home/admin/workarea/git/fotonowerpip', '/home/admin/workarea/install/segment-anything', '/home/admin/workarea/git/pyfvs', '/usr/lib/python38.zip', '/usr/lib/python3.8', '/usr/lib/python3.8/lib-dynload', '/home/admin/.local/lib/python3.8/site-packages', '/usr/local/lib/python3.8/dist-packages', '/usr/lib/python3/dist-packages']
process id : 1329704
load datou : 3318
# VR 17-11-17 : to create in DB !
Here we check the datou graph and we reorder steps !
Tree built and cycle checked, now we need to re-order the steps !
We currently have an error because there is no dependence between the last step for the case tile - detect - glue
We could keep the dependence as it is, but it is better to keep an order compatible with the id of the steps if we do not have sons, so a lexical order : (number_son, step_id)
All sons are already in current list !
All sons are already in current list !
All sons are already in current list !
All sons are already in current list !
All sons are already in current list !
All sons are already in current list !
All sons are already in current list !
All sons are already in current list !
All sons are already in current list !
DONE and to test : checkNoCycle !
Here we check the consistency of the inputs/outputs number between the given ones and the db !
eke 1-6-18 : checkConsistencyNbInputNbOutput should be processed after step reordering !
WARNING : number of outputs for step 7928 mask_detect is not consistent : 3 used against 2 in the step definition !
Step 8092 crop_condition has fewer inputs used (1) than in the step definition (2) : maybe we manage optional inputs !
WARNING : number of outputs for step 8092 crop_condition is not consistent : 4 used against 3 in the step definition !
WARNING : number of inputs for step 7933 rle_unique_nms_with_priority is not consistent : 2 used against 1 in the step definition !
WARNING : number of outputs for step 7933 rle_unique_nms_with_priority is not consistent : 2 used against 1 in the step definition !
WARNING : number of outputs for step 7935 ventilate_hashtags_in_portfolio is not consistent : 2 used against 1 in the step definition !
Step 7934 final has fewer inputs used (2) than in the step definition (3) : maybe we manage optional inputs !
Step 7934 final has fewer outputs used (1) than in the step definition (2) : some outputs may not be used !
WARNING : number of outputs for step 13649 velours_tree is not consistent : 2 used against 1 in the step definition !
Step 9283 split_time_score has fewer inputs used (1) than in the step definition (2) : maybe we manage optional inputs !
Number of inputs / outputs for each step checked !
Here we check the consistency of output/input types across step connections
eke 1-6-18 : checkConsistencyTypeOutputInput should be processed after checkConsistencyNbInputNbOutput !
We ignore checkConsistencyTypeOutputInput for datou_step final !
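The re-ordering and cycle check logged above sort the datou steps so that every step comes after its parents, breaking ties lexically on (number_son, step_id). Below is a minimal, hedged sketch of that idea; the step records ({step_id: [parent_step_ids]}) are illustrative and not the real datou schema.

    # Hypothetical step records: {step_id: [parent_step_ids]}; not the actual datou tables.
    def order_steps(parents_by_step):
        """Kahn-style ordering: parents first; ties broken by (number of sons, step_id)."""
        sons = {step: [] for step in parents_by_step}
        indegree = {step: len(parents) for step, parents in parents_by_step.items()}
        for step, parents in parents_by_step.items():
            for parent in parents:
                sons[parent].append(step)

        ready = sorted((s for s, d in indegree.items() if d == 0),
                       key=lambda s: (len(sons[s]), s))
        ordered = []
        while ready:
            step = ready.pop(0)
            ordered.append(step)
            for son in sons[step]:
                indegree[son] -= 1
                if indegree[son] == 0:
                    ready.append(son)
            ready.sort(key=lambda s: (len(sons[s]), s))

        if len(ordered) != len(parents_by_step):  # some steps never became ready
            raise ValueError("cycle detected in the datou step graph")
        return ordered

    # Example: 7928 feeds 8092, which feeds 7933.
    print(order_steps({7928: [], 8092: [7928], 7933: [8092]}))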
WARNING : type of output 1 of step 7935 doesn't seem to be defined in the database
WARNING : type of input 3 of step 7934 doesn't seem to be defined in the database
We ignore checkConsistencyTypeOutputInput for datou_step final !
WARNING : type of input 1 of step 7935 doesn't seem to be defined in the database
WARNING : output 1 of step 7933 has datatype=7 whereas input 1 of step 7935 has datatype=None
WARNING : type of output 2 of step 7928 doesn't seem to be defined in the database
WARNING : type of input 2 of step 8092 doesn't seem to be defined in the database
WARNING : type of output 3 of step 8092 doesn't seem to be defined in the database
WARNING : type of input 1 of step 7933 doesn't seem to be defined in the database
WARNING : type of output 2 of step 7928 doesn't seem to be defined in the database
WARNING : type of input 1 of step 10917 doesn't seem to be defined in the database
WARNING : type of output 2 of step 7928 doesn't seem to be defined in the database
WARNING : type of input 1 of step 10918 doesn't seem to be defined in the database
We ignore checkConsistencyTypeOutputInput for datou_step final !
WARNING : output 0 of step 7935 has datatype=10 whereas input 3 of step 10916 has datatype=6
WARNING : output 0 of step 7935 has datatype=10 whereas input 0 of step 13649 has datatype=18
WARNING : type of output 1 of step 13649 doesn't seem to be defined in the database
WARNING : type of input 5 of step 10916 doesn't seem to be defined in the database
DataTypes for each output/input checked !
Unexpected type for variable list_input_json
ERROR or WARNING : can't parse json string Expecting value: line 1 column 1 (char 0)
Tried to parse :
path of the photo was removed, should we ?
(photo_id, hashtag_id, score_max) was removed, should we ?
[(photo_id, hashtag_id, hashtag_type, x0, x1, y0, y1, score, seg_temp, polygons), ...] was removed, should we ?
path of the photo was removed, should we ?
[ (photo_id_loc, hashtag_id, hashtag_type, x0, x1, y0, y1, score, None), ...] was removed, should we ?
path of the photo was removed, should we ?
id of the photo (can be local or global) was removed, should we ?
path of the photo was removed, should we ?
(x0, y0, x1, y1) was removed, should we ?
path of the photo was removed, should we ?
data as text was removed, should we ?
[ (photo_id, photo_id_loc, hashtag_type, x0, x1, y0, y1, score), ...] was removed, should we ?
None was removed, should we ?
data as text was removed, should we ?
(photo_id, hashtag_id, score_max) was removed, should we ?
id of the photo (can be local or global) was removed, should we ?
data as text was removed, should we ?
data as text was removed, should we ?
data as text was removed, should we ?
path of the photo was removed, should we ?
(photo_id, hashtag_id, score_max) was removed, should we ?
path of the photo was removed, should we ?
(photo_id, hashtag_id, score_max) was removed, should we ?
None was removed, should we ?
data as a number was removed, should we ?
(photo_id, hashtag_id, score_max) was removed, should we ?
(photo_id, hashtag_id, score_max) was removed, should we ?
(photo_id, hashtag_id, score_max) was removed, should we ?
(photo_id, hashtag_id, score_max) was removed, should we ?
(photo_id, hashtag_id, score_max) was removed, should we ?
data as text was removed, should we ?
None was removed, should we ?
data as text was removed, should we ?
[ptf_id0,ptf_id1...] was removed, should we ?
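The "can't parse json string Expecting value: line 1 column 1 (char 0)" message above is what json.loads raises on an empty string, which is why list_input_json falls back to []. A small, hedged sketch of that defensive parse; parse_list_input_json is an illustrative name, not the actual helper.

    import json

    def parse_list_input_json(raw):
        """Return a list parsed from `raw`, or [] when the string is empty or not valid JSON."""
        if not raw:
            return []
        try:
            value = json.loads(raw)
        except json.JSONDecodeError as exc:
            print("ERROR or WARNING : can't parse json string", exc)
            print("Tried to parse :", raw)
            return []
        if not isinstance(value, list):
            print("Unexpected type for variable list_input_json")
            return []
        return value

    print(parse_list_input_json(""))        # [] -- the case logged above
    print(parse_list_input_json("[1, 2]"))  # [1, 2]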
FOUND : 1
Here is data_from_sql_as_vec to set the ParamDescriptorType : (5275, 'learn_RUBBIA_REFUS_AMIENS_23', 16384, 25088, 'learn_RUBBIA_REFUS_AMIENS_23', 'pool5', 10.0, None, None, 256, None, 0, None, 8, None, None, -1000.0, 1, datetime.datetime(2021, 4, 23, 14, 19, 39), datetime.datetime(2021, 4, 23, 14, 19, 39))
load thcls
load THCL from format json or kwargs
add thcl : 2847 in CacheModelConfig
load pdts
add pdt : 5275 in CacheModelConfig
Running datou job : batch_current
TODO datou_current to load, maybe to take outside batchDatouExec
updating current state to 1
list_input_json: []
Current got : datou_id : 3318, datou_cur_ids : ['2732571'] with mtr_portfolio_ids : ['22142991'] and first list_photo_ids : []
new path : /proc/1329704/
Inside batchDatouExec : verbose : 0
# VR 17-11-17 : to create in DB !
Here we check the datou graph and we reorder steps !
Tree built and cycle checked, now we need to re-order the steps !
We currently have an error because there is no dependence between the last step for the case tile - detect - glue
We could keep the dependence as it is, but it is better to keep an order compatible with the id of the steps if we do not have sons, so a lexical order : (number_son, step_id)
All sons are already in current list !
All sons are already in current list !
All sons are already in current list !
All sons are already in current list !
All sons are already in current list !
All sons are already in current list !
All sons are already in current list !
All sons are already in current list !
All sons are already in current list !
DONE and to test : checkNoCycle !
Here we check the consistency of the inputs/outputs number between the given ones and the db !
eke 1-6-18 : checkConsistencyNbInputNbOutput should be processed after step reordering !
WARNING : number of outputs for step 7928 mask_detect is not consistent : 3 used against 2 in the step definition !
Step 8092 crop_condition has fewer inputs used (1) than in the step definition (2) : maybe we manage optional inputs !
WARNING : number of outputs for step 8092 crop_condition is not consistent : 4 used against 3 in the step definition !
WARNING : number of inputs for step 7933 rle_unique_nms_with_priority is not consistent : 2 used against 1 in the step definition !
WARNING : number of outputs for step 7933 rle_unique_nms_with_priority is not consistent : 2 used against 1 in the step definition !
WARNING : number of outputs for step 7935 ventilate_hashtags_in_portfolio is not consistent : 2 used against 1 in the step definition !
Step 7934 final has fewer inputs used (2) than in the step definition (3) : maybe we manage optional inputs !
Step 7934 final has fewer outputs used (1) than in the step definition (2) : some outputs may not be used !
WARNING : number of outputs for step 13649 velours_tree is not consistent : 2 used against 1 in the step definition !
Step 9283 split_time_score has fewer inputs used (1) than in the step definition (2) : maybe we manage optional inputs !
Number of inputs / outputs for each step checked !
Here we check the consistency of output/input types across step connections
eke 1-6-18 : checkConsistencyTypeOutputInput should be processed after checkConsistencyNbInputNbOutput !
We ignore checkConsistencyTypeOutputInput for datou_step final !
WARNING : type of output 1 of step 7935 doesn't seem to be defined in the database
WARNING : type of input 3 of step 7934 doesn't seem to be defined in the database
We ignore checkConsistencyTypeOutputInput for datou_step final !
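checkConsistencyNbInputNbOutput and checkConsistencyTypeOutputInput, whose warnings appear above and below, compare the inputs/outputs actually wired in the datou against the step definitions stored in the database. A hedged sketch of both checks, with illustrative function signatures rather than the real ones.

    # Illustrative signatures; the real checks read the step definitions from the datou tables.
    def check_nb_inputs_outputs(step_id, name, used_in, used_out, def_in, def_out):
        """Mirror the count checks logged above: fewer inputs may be optional ones,
        fewer outputs may simply be unused, anything else is a WARNING."""
        if used_in < def_in:
            print(f"Step {step_id} {name} has fewer inputs used ({used_in}) than in the "
                  f"step definition ({def_in}) : maybe we manage optional inputs")
        elif used_in != def_in:
            print(f"WARNING : number of inputs for step {step_id} {name} is not consistent : "
                  f"{used_in} used against {def_in} in the step definition")
        if used_out < def_out:
            print(f"Step {step_id} {name} has fewer outputs used ({used_out}) than in the "
                  f"step definition ({def_out}) : some outputs may not be used")
        elif used_out != def_out:
            print(f"WARNING : number of outputs for step {step_id} {name} is not consistent : "
                  f"{used_out} used against {def_out} in the step definition")

    def check_type_output_input(out_type, in_type):
        """Mirror the datatype check: warn when either side is undefined or they differ."""
        if out_type is None or in_type is None:
            print("WARNING : datatype doesn't seem to be defined in the database")
        elif out_type != in_type:
            print(f"WARNING : output has datatype={out_type} whereas input has datatype={in_type}")

    check_nb_inputs_outputs(7928, "mask_detect", used_in=1, used_out=3, def_in=1, def_out=2)
    check_type_output_input(10, 6)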
WARNING : type of input 1 of step 7935 doesn't seem to be defined in the database
WARNING : output 1 of step 7933 has datatype=7 whereas input 1 of step 7935 has datatype=None
WARNING : type of output 2 of step 7928 doesn't seem to be defined in the database
WARNING : type of input 2 of step 8092 doesn't seem to be defined in the database
WARNING : type of output 3 of step 8092 doesn't seem to be defined in the database
WARNING : type of input 1 of step 7933 doesn't seem to be defined in the database
WARNING : type of output 2 of step 7928 doesn't seem to be defined in the database
WARNING : type of input 1 of step 10917 doesn't seem to be defined in the database
WARNING : type of output 2 of step 7928 doesn't seem to be defined in the database
WARNING : type of input 1 of step 10918 doesn't seem to be defined in the database
We ignore checkConsistencyTypeOutputInput for datou_step final !
WARNING : output 0 of step 7935 has datatype=10 whereas input 3 of step 10916 has datatype=6
WARNING : output 0 of step 7935 has datatype=10 whereas input 0 of step 13649 has datatype=18
WARNING : type of output 1 of step 13649 doesn't seem to be defined in the database
WARNING : type of input 5 of step 10916 doesn't seem to be defined in the database
DataTypes for each output/input checked !
List Step Type Loaded in datou : mask_detect, crop_condition, rle_unique_nms_with_priority, ventilate_hashtags_in_portfolio, final, blur_detection, brightness, velours_tree, send_mail_cod, split_time_score
over limit max, limiting to limit_max 40
list_input_json : [] origin
We have 1 , BFBFBFBFBFBFBFBFBFBFBFBFBFBFBF
we have 0 missing photos in the step downloads : photos missing : []
try to delete the photos missing in DB
length of list_filenames : 15 ; length of list_pids : 15 ; length of list_args : 15
time to download the photos : 2.4280552864074707
About to test input to load
we should then remove the video here, and this would fix the bug of datou_current !
Calling datou_exec
Inside datou_exec : verbose : 0
number of steps : 10
step1:mask_detect Tue Apr 8 13:00:31 2025
VR 17-11-17 : now, only for a linear exec dependencies tree, some output goes to fill the input of the next
VR 22-3-18 : now we test the dependencies tree, but keep two separate code paths for datou_prepare_output_input until the code is correctly tested, clean and works in both cases
VR 22-3-18 : but we use the first code for the first step id = -1, built in the code of datou_exec
VR 22-3-18 : we should manage here the case when we are at the first step instead of building this step before datou_exec
Beginning of datou step mask_detect !
save_polygon : True
begin detect
begin to check gpu status
inside check gpu memory l 3637
free memory gpu now : 5399
max_wait_temp : 1
max_wait : 0
gpu_flag : 0
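The "check gpu memory" block just above reports roughly 5.4 GB free before the detector starts. One hedged way to obtain that figure is to query nvidia-smi, as sketched below; the real check (around line 3637 of the step code) may do it differently.

    import subprocess

    def free_gpu_memory_mib(gpu_index=0):
        """Return the free memory of one GPU in MiB, as reported by nvidia-smi."""
        out = subprocess.check_output(
            ["nvidia-smi", "--query-gpu=memory.free",
             "--format=csv,noheader,nounits", "-i", str(gpu_index)],
            text=True,
        )
        return int(out.strip())

    if __name__ == "__main__":
        print("free memory gpu now :", free_gpu_memory_mib())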
2025-04-08 13:00:34.119484: I tensorflow/core/platform/cpu_feature_guard.cc:143] Your CPU supports instructions that this TensorFlow binary was not compiled to use: AVX2 FMA
2025-04-08 13:00:34.147170: I tensorflow/core/platform/profile_utils/cpu_utils.cc:102] CPU Frequency: 3493065000 Hz
2025-04-08 13:00:34.149367: I tensorflow/compiler/xla/service/service.cc:168] XLA service 0x7f39f0000b60 initialized for platform Host (this does not guarantee that XLA will be used). Devices:
2025-04-08 13:00:34.149427: I tensorflow/compiler/xla/service/service.cc:176] StreamExecutor device (0): Host, Default Version
2025-04-08 13:00:34.153388: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcuda.so.1
2025-04-08 13:00:34.318921: I tensorflow/compiler/xla/service/service.cc:168] XLA service 0x1e019220 initialized for platform CUDA (this does not guarantee that XLA will be used). Devices:
2025-04-08 13:00:34.318974: I tensorflow/compiler/xla/service/service.cc:176] StreamExecutor device (0): NVIDIA GeForce RTX 2080 Ti, Compute Capability 7.5
2025-04-08 13:00:34.319863: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1561] Found device 0 with properties:
pciBusID: 0000:41:00.0 name: NVIDIA GeForce RTX 2080 Ti computeCapability: 7.5
coreClock: 1.545GHz coreCount: 68 deviceMemorySize: 10.76GiB deviceMemoryBandwidth: 573.69GiB/s
2025-04-08 13:00:34.320207: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcudart.so.10.1
2025-04-08 13:00:34.322328: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcublas.so.10
2025-04-08 13:00:34.324459: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcufft.so.10
2025-04-08 13:00:34.324786: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcurand.so.10
2025-04-08 13:00:34.327039: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcusolver.so.10
2025-04-08 13:00:34.328177: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcusparse.so.10
2025-04-08 13:00:34.332878: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcudnn.so.7
2025-04-08 13:00:34.334080: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1703] Adding visible gpu devices: 0
2025-04-08 13:00:34.334172: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcudart.so.10.1
2025-04-08 13:00:34.334806: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1102] Device interconnect StreamExecutor with strength 1 edge matrix:
2025-04-08 13:00:34.334822: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1108] 0
2025-04-08 13:00:34.334831: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1121] 0: N
2025-04-08 13:00:34.336931: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1247] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 4789 MB memory) -> physical GPU (device: 0, name: NVIDIA GeForce RTX 2080 Ti, pci bus id: 0000:41:00.0, compute capability: 7.5)
WARNING:tensorflow:From /home/admin/workarea/git/Velours/python/mtr/mask_rcnn/mask_detection.py:69: The name tf.keras.backend.set_session is deprecated. Please use tf.compat.v1.keras.backend.set_session instead.
2025-04-08 13:00:34.654400: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1561] Found device 0 with properties:
pciBusID: 0000:41:00.0 name: NVIDIA GeForce RTX 2080 Ti computeCapability: 7.5
coreClock: 1.545GHz coreCount: 68 deviceMemorySize: 10.76GiB deviceMemoryBandwidth: 573.69GiB/s
2025-04-08 13:00:34.654522: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcudart.so.10.1
2025-04-08 13:00:34.654556: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcublas.so.10
2025-04-08 13:00:34.654587: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcufft.so.10
2025-04-08 13:00:34.654615: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcurand.so.10
2025-04-08 13:00:34.654644: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcusolver.so.10
2025-04-08 13:00:34.654671: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcusparse.so.10
2025-04-08 13:00:34.654699: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcudnn.so.7
2025-04-08 13:00:34.656217: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1703] Adding visible gpu devices: 0
2025-04-08 13:00:34.657460: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1561] Found device 0 with properties:
pciBusID: 0000:41:00.0 name: NVIDIA GeForce RTX 2080 Ti computeCapability: 7.5
coreClock: 1.545GHz coreCount: 68 deviceMemorySize: 10.76GiB deviceMemoryBandwidth: 573.69GiB/s
2025-04-08 13:00:34.657531: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcudart.so.10.1
2025-04-08 13:00:34.657561: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcublas.so.10
2025-04-08 13:00:34.657589: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcufft.so.10
2025-04-08 13:00:34.657614: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcurand.so.10
2025-04-08 13:00:34.657639: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcusolver.so.10
2025-04-08 13:00:34.657664: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcusparse.so.10
2025-04-08 13:00:34.657691: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcudnn.so.7
2025-04-08 13:00:34.658992: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1703] Adding visible gpu devices: 0
2025-04-08 13:00:34.659031: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1102] Device interconnect StreamExecutor with strength 1 edge matrix:
2025-04-08 13:00:34.659044: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1108] 0
2025-04-08 13:00:34.659055: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1121] 0: N
2025-04-08 13:00:34.660363: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1247] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 4789 MB memory) -> physical GPU (device: 0, name: NVIDIA GeForce RTX 2080 Ti, pci bus id: 0000:41:00.0, compute capability: 7.5)
Using TensorFlow backend.
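The deprecation warning above shows mask_detection.py binding a Keras session via tf.keras.backend.set_session, and the device is created with only 4789 MB of a 10.76 GiB card, consistent with a fixed memory budget; the CUDNN_STATUS_INTERNAL_ERROR and CUDA_ERROR_OUT_OF_MEMORY failures further down are the usual symptom of that budget being too tight. A hedged sketch of the common TF 1.x-style mitigation, not necessarily what mask_detection.py actually does:

    import tensorflow as tf

    # TF 1.x-style session configuration through the compat layer:
    # grow GPU memory on demand instead of grabbing a fixed budget up front.
    config = tf.compat.v1.ConfigProto()
    config.gpu_options.allow_growth = True
    # Alternatively, cap the budget explicitly:
    # config.gpu_options.per_process_gpu_memory_fraction = 0.8

    session = tf.compat.v1.Session(config=config)
    tf.compat.v1.keras.backend.set_session(session)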
WARNING:tensorflow:From /home/admin/workarea/install/Mask_RCNN/model.py:396: calling crop_and_resize_v1 (from tensorflow.python.ops.image_ops_impl) with box_ind is deprecated and will be removed in a future version.
Instructions for updating:
box_ind is deprecated, use box_indices instead
WARNING:tensorflow:From /home/admin/workarea/install/Mask_RCNN/model.py:703: to_int32 (from tensorflow.python.ops.math_ops) is deprecated and will be removed in a future version.
Instructions for updating:
Use `tf.cast` instead.
WARNING:tensorflow:From /home/admin/workarea/install/Mask_RCNN/model.py:729: to_float (from tensorflow.python.ops.math_ops) is deprecated and will be removed in a future version.
Instructions for updating:
Use `tf.cast` instead.
Inside mask_sub_process
Inside mask_detect
About to load cache.load_thcl_param
To do loadFromThcl(), then load ParamDescType : thcl2847
thcls : [{'id': 2847, 'mtr_user_id': 31, 'name': 'learn_RUBBIA_REFUS_AMIENS_23', 'pb_hashtag_id': 0, 'live': b'\x00', 'list_hashtags': 'background,papier,carton,metal,pet_clair,autre,pehd,pet_fonce,environnement', 'svm_portfolios_learning': '0,0,0,0,0,0,0,0,0', 'photo_hashtag_type': 3594, 'photo_desc_type': 5275, 'type_classification': 'mask_rcnn', 'hashtag_id_list': '0,0,0,0,0,0,0,0,0'}]
thcl {'id': 2847, 'mtr_user_id': 31, 'name': 'learn_RUBBIA_REFUS_AMIENS_23', 'pb_hashtag_id': 0, 'live': b'\x00', 'list_hashtags': 'background,papier,carton,metal,pet_clair,autre,pehd,pet_fonce,environnement', 'svm_portfolios_learning': '0,0,0,0,0,0,0,0,0', 'photo_hashtag_type': 3594, 'photo_desc_type': 5275, 'type_classification': 'mask_rcnn', 'hashtag_id_list': '0,0,0,0,0,0,0,0,0'}
Update svm_hashtag_type_desc : 5275
FOUND : 1
Here is data_from_sql_as_vec to set the ParamDescriptorType : (5275, 'learn_RUBBIA_REFUS_AMIENS_23', 16384, 25088, 'learn_RUBBIA_REFUS_AMIENS_23', 'pool5', 10.0, None, None, 256, None, 0, None, 8, None, None, -1000.0, 1, datetime.datetime(2021, 4, 23, 14, 19, 39), datetime.datetime(2021, 4, 23, 14, 19, 39))
{'thcl': {'id': 2847, 'mtr_user_id': 31, 'name': 'learn_RUBBIA_REFUS_AMIENS_23', 'pb_hashtag_id': 0, 'live': b'\x00', 'list_hashtags': 'background,papier,carton,metal,pet_clair,autre,pehd,pet_fonce,environnement', 'svm_portfolios_learning': '0,0,0,0,0,0,0,0,0', 'photo_hashtag_type': 3594, 'photo_desc_type': 5275, 'type_classification': 'mask_rcnn', 'hashtag_id_list': '0,0,0,0,0,0,0,0,0'}, 'list_hashtags': ['background', 'papier', 'carton', 'metal', 'pet_clair', 'autre', 'pehd', 'pet_fonce', 'environnement'], 'list_hashtags_csv': 'background,papier,carton,metal,pet_clair,autre,pehd,pet_fonce,environnement', 'svm_portfolios_learning': '0,0,0,0,0,0,0,0,0', 'photo_hashtag_type': 3594, 'svm_hashtag_type_desc': 5275, 'photo_desc_type': 5275, 'pb_hashtag_id_or_classifier': 0}
list_class_names : ['background', 'papier', 'carton', 'metal', 'pet_clair', 'autre', 'pehd', 'pet_fonce', 'environnement']
Configurations:
BACKBONE resnet101
BACKBONE_SHAPES [[160 160] [ 80 80] [ 40 40] [ 20 20] [ 10 10]]
BACKBONE_STRIDES [4, 8, 16, 32, 64]
BATCH_SIZE 1
BBOX_STD_DEV [0.1 0.1 0.2 0.2]
DETECTION_MAX_INSTANCES 100
DETECTION_MIN_CONFIDENCE 0.3
DETECTION_NMS_THRESHOLD 0.3
GPU_COUNT 1
IMAGES_PER_GPU 1
IMAGE_MAX_DIM 640
IMAGE_MIN_DIM 640
IMAGE_PADDING True
IMAGE_SHAPE [640 640 3]
LEARNING_MOMENTUM 0.9
LEARNING_RATE 0.001
LOSS_WEIGHTS {'rpn_class_loss': 1.0, 'rpn_bbox_loss': 1.0, 'mrcnn_class_loss': 1.0, 'mrcnn_bbox_loss': 1.0, 'mrcnn_mask_loss': 1.0}
MASK_POOL_SIZE 14
MASK_SHAPE [28, 28]
MAX_GT_INSTANCES 100
MEAN_PIXEL [123.7 116.8 103.9]
MINI_MASK_SHAPE (56, 56)
NAME learn_RUBBIA_REFUS_AMIENS_23
NUM_CLASSES 9
POOL_SIZE 7
POST_NMS_ROIS_INFERENCE 1000
POST_NMS_ROIS_TRAINING 2000
ROI_POSITIVE_RATIO 0.33
RPN_ANCHOR_RATIOS [0.5, 1, 2]
RPN_ANCHOR_SCALES (16, 32, 64, 128, 256)
RPN_ANCHOR_STRIDE 1
RPN_BBOX_STD_DEV [0.1 0.1 0.2 0.2]
RPN_NMS_THRESHOLD 0.7
RPN_TRAIN_ANCHORS_PER_IMAGE 256
STEPS_PER_EPOCH 1000
TRAIN_ROIS_PER_IMAGE 200
USE_MINI_MASK True
USE_RPN_ROIS True
VALIDATION_STEPS 50
WEIGHT_DECAY 0.0001
model_param file didn't exist
model_name : learn_RUBBIA_REFUS_AMIENS_23
model_type : mask_rcnn
list of files needed : ['mask_model.h5']
files existing in s3 : ['mask_model.h5']
files missing in s3 : []
2025-04-08 13:00:48.717631: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcublas.so.10
2025-04-08 13:00:48.722672: E tensorflow/stream_executor/cuda/cuda_blas.cc:238] failed to create cublas handle: CUBLAS_STATUS_NOT_INITIALIZED
2025-04-08 13:00:48.725954: E tensorflow/stream_executor/cuda/cuda_blas.cc:238] failed to create cublas handle: CUBLAS_STATUS_NOT_INITIALIZED
2025-04-08 13:00:48.737361: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcudnn.so.7
2025-04-08 13:00:48.745503: E tensorflow/stream_executor/cuda/cuda_dnn.cc:329] Could not create cudnn handle: CUDNN_STATUS_INTERNAL_ERROR
2025-04-08 13:00:48.747661: E tensorflow/stream_executor/cuda/cuda_dnn.cc:329] Could not create cudnn handle: CUDNN_STATUS_INTERNAL_ERROR
2025-04-08 13:00:49.579687: E tensorflow/stream_executor/cuda/cuda_dnn.cc:329] Could not create cudnn handle: CUDNN_STATUS_INTERNAL_ERROR
2025-04-08 13:00:49.582050: E tensorflow/stream_executor/cuda/cuda_dnn.cc:329] Could not create cudnn handle: CUDNN_STATUS_INTERNAL_ERROR
2025-04-08 13:00:50.487374: E tensorflow/stream_executor/cuda/cuda_dnn.cc:329] Could not create cudnn handle: CUDNN_STATUS_INTERNAL_ERROR
2025-04-08 13:00:50.489411: E tensorflow/stream_executor/cuda/cuda_dnn.cc:329] Could not create cudnn handle: CUDNN_STATUS_INTERNAL_ERROR
2025-04-08 13:00:51.470765: E tensorflow/stream_executor/cuda/cuda_dnn.cc:329] Could not create cudnn handle: CUDNN_STATUS_INTERNAL_ERROR
2025-04-08 13:00:51.473842: E tensorflow/stream_executor/cuda/cuda_dnn.cc:329] Could not create cudnn handle: CUDNN_STATUS_INTERNAL_ERROR
2025-04-08 13:00:52.877968: E tensorflow/stream_executor/cuda/cuda_dnn.cc:329] Could not create cudnn handle: CUDNN_STATUS_INTERNAL_ERROR
2025-04-08 13:00:52.880489: E tensorflow/stream_executor/cuda/cuda_dnn.cc:329] Could not create cudnn handle: CUDNN_STATUS_INTERNAL_ERROR
2025-04-08 13:00:54.815926: E tensorflow/stream_executor/cuda/cuda_dnn.cc:329] Could not create cudnn handle: CUDNN_STATUS_INTERNAL_ERROR
2025-04-08 13:00:54.818060: E tensorflow/stream_executor/cuda/cuda_dnn.cc:329] Could not create cudnn handle: CUDNN_STATUS_INTERNAL_ERROR
2025-04-08 13:00:56.077416: E tensorflow/stream_executor/cuda/cuda_dnn.cc:329] Could not create cudnn handle: CUDNN_STATUS_INTERNAL_ERROR
2025-04-08 13:00:56.079531: E tensorflow/stream_executor/cuda/cuda_dnn.cc:329] Could not create cudnn handle: CUDNN_STATUS_INTERNAL_ERROR
local folder : /data/models_weight/learn_RUBBIA_REFUS_AMIENS_23
/data/models_weight/learn_RUBBIA_REFUS_AMIENS_23/mask_model.h5
size_local : 256009536
size in s3 : 256009536
create time local : 2021-08-09 09:43:22
create time in s3 : 2021-08-06 18:54:04
mask_model.h5 already exists and didn't need to be updated
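The size_local / size in s3 comparison above decides whether mask_model.h5 has to be downloaded again. A hedged boto3 sketch of that kind of freshness check; the bucket and key names are placeholders, not the real storage layout.

    import os
    import boto3

    def needs_update(local_path, bucket, key):
        """Re-download only when the local file is absent or its size differs from the S3 object."""
        s3 = boto3.client("s3")
        size_in_s3 = s3.head_object(Bucket=bucket, Key=key)["ContentLength"]
        if not os.path.exists(local_path):
            return True
        size_local = os.path.getsize(local_path)
        print("size_local :", size_local, "size in s3 :", size_in_s3)
        return size_local != size_in_s3

    # Placeholder bucket/key for illustration only.
    if needs_update("/data/models_weight/learn_RUBBIA_REFUS_AMIENS_23/mask_model.h5",
                    "models-bucket", "learn_RUBBIA_REFUS_AMIENS_23/mask_model.h5"):
        print("mask_model.h5 needs to be downloaded again")
    else:
        print("mask_model.h5 already exists and didn't need to be updated")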
list_images length : 15
NEW PHOTO
Processing 1 images
image shape: (2160, 3264, 3) min: 0.00000 max: 255.00000
molded_images shape: (1, 640, 640, 3) min: -123.70000 max: 151.10000
image_metas shape: (1, 17) min: 0.00000 max: 3264.00000
error while detecting the image : temp/1744110028_1329704_1350595105_092b2dbb95baba16145abbd783380051.jpg
2 root error(s) found.
(0) Unknown: Failed to get convolution algorithm. This is probably because cuDNN failed to initialize, so try looking to see if a warning log message was printed above.
[[node conv1/convolution (defined at /usr/local/lib/python3.8/dist-packages/keras/backend/tensorflow_backend.py:3007) ]]
[[mrcnn_detection/map/while/LoopCond/_22/_118]]
(1) Unknown: Failed to get convolution algorithm. This is probably because cuDNN failed to initialize, so try looking to see if a warning log message was printed above.
[[node conv1/convolution (defined at /usr/local/lib/python3.8/dist-packages/keras/backend/tensorflow_backend.py:3007) ]]
0 successful operations.
0 derived errors ignored.
[Op:__inference_keras_scratch_graph_13584]
Function call stack:
keras_scratch_graph -> keras_scratch_graph
NEW PHOTO
Processing 1 images
image shape: (2160, 3264, 3) min: 0.00000 max: 255.00000
molded_images shape: (1, 640, 640, 3) min: -123.70000 max: 151.10000
image_metas shape: (1, 17) min: 0.00000 max: 3264.00000
error while detecting the image : temp/1744110028_1329704_1350595100_20e735cca2195940e00682fa2f3df0d7.jpg
2 root error(s) found.
(0) Unknown: Failed to get convolution algorithm. This is probably because cuDNN failed to initialize, so try looking to see if a warning log message was printed above.
[[node conv1/convolution (defined at /usr/local/lib/python3.8/dist-packages/keras/backend/tensorflow_backend.py:3007) ]]
[[mrcnn_detection/map/while/LoopCond/_22/_118]]
(1) Unknown: Failed to get convolution algorithm. This is probably because cuDNN failed to initialize, so try looking to see if a warning log message was printed above.
[[node conv1/convolution (defined at /usr/local/lib/python3.8/dist-packages/keras/backend/tensorflow_backend.py:3007) ]]
0 successful operations.
0 derived errors ignored.
[Op:__inference_keras_scratch_graph_13584]
Function call stack:
keras_scratch_graph -> keras_scratch_graph
NEW PHOTO
Processing 1 images
image shape: (2160, 3264, 3) min: 0.00000 max: 255.00000
molded_images shape: (1, 640, 640, 3) min: -123.70000 max: 151.10000
image_metas shape: (1, 17) min: 0.00000 max: 3264.00000
error while detecting the image : temp/1744110028_1329704_1350595098_3ddbd4ead9d6eb6efe98ccd8da6c9119.jpg
2 root error(s) found.
(0) Unknown: Failed to get convolution algorithm. This is probably because cuDNN failed to initialize, so try looking to see if a warning log message was printed above.
[[node conv1/convolution (defined at /usr/local/lib/python3.8/dist-packages/keras/backend/tensorflow_backend.py:3007) ]]
[[mrcnn_detection/map/while/LoopCond/_22/_118]]
(1) Unknown: Failed to get convolution algorithm. This is probably because cuDNN failed to initialize, so try looking to see if a warning log message was printed above.
[[node conv1/convolution (defined at /usr/local/lib/python3.8/dist-packages/keras/backend/tensorflow_backend.py:3007) ]]
0 successful operations.
0 derived errors ignored.
[Op:__inference_keras_scratch_graph_13584]
Function call stack:
keras_scratch_graph -> keras_scratch_graph
NEW PHOTO
Processing 1 images
image shape: (2160, 3264, 3) min: 0.00000 max: 255.00000
molded_images shape: (1, 640, 640, 3) min: -123.70000 max: 151.10000
image_metas shape: (1, 17) min: 0.00000 max: 3264.00000
error while detecting the image : temp/1744110028_1329704_1350595092_f59fedfb4d4b01a2c75b956a84583997.jpg
2 root error(s) found.
(0) Unknown: Failed to get convolution algorithm. This is probably because cuDNN failed to initialize, so try looking to see if a warning log message was printed above.
[[node conv1/convolution (defined at /usr/local/lib/python3.8/dist-packages/keras/backend/tensorflow_backend.py:3007) ]]
[[mrcnn_detection/map/while/LoopCond/_22/_118]]
(1) Unknown: Failed to get convolution algorithm. This is probably because cuDNN failed to initialize, so try looking to see if a warning log message was printed above.
[[node conv1/convolution (defined at /usr/local/lib/python3.8/dist-packages/keras/backend/tensorflow_backend.py:3007) ]]
0 successful operations.
0 derived errors ignored.
[Op:__inference_keras_scratch_graph_13584]
Function call stack:
keras_scratch_graph -> keras_scratch_graph
NEW PHOTO
Processing 1 images
image shape: (2160, 3264, 3) min: 0.00000 max: 255.00000
molded_images shape: (1, 640, 640, 3) min: -123.70000 max: 151.10000
image_metas shape: (1, 17) min: 0.00000 max: 3264.00000
error while detecting the image : temp/1744110028_1329704_1350595000_87b8b89c12eb7f5c6294c1ef0ed4b618.jpg
2 root error(s) found.
(0) Unknown: Failed to get convolution algorithm. This is probably because cuDNN failed to initialize, so try looking to see if a warning log message was printed above.
[[node conv1/convolution (defined at /usr/local/lib/python3.8/dist-packages/keras/backend/tensorflow_backend.py:3007) ]]
[[mrcnn_detection/map/while/LoopCond/_22/_118]]
(1) Unknown: Failed to get convolution algorithm. This is probably because cuDNN failed to initialize, so try looking to see if a warning log message was printed above.
[[node conv1/convolution (defined at /usr/local/lib/python3.8/dist-packages/keras/backend/tensorflow_backend.py:3007) ]]
0 successful operations.
0 derived errors ignored.
[Op:__inference_keras_scratch_graph_13584]
Function call stack:
keras_scratch_graph -> keras_scratch_graph
NEW PHOTO
Processing 1 images
image shape: (2160, 3264, 3) min: 0.00000 max: 255.00000
molded_images shape: (1, 640, 640, 3) min: -123.70000 max: 151.10000
image_metas shape: (1, 17) min: 0.00000 max: 3264.00000
error while detecting the image : temp/1744110028_1329704_1350594997_d37e34b05205e7c13471f9c52daefc16.jpg
2 root error(s) found.
(0) Unknown: Failed to get convolution algorithm. This is probably because cuDNN failed to initialize, so try looking to see if a warning log message was printed above.
[[node conv1/convolution (defined at /usr/local/lib/python3.8/dist-packages/keras/backend/tensorflow_backend.py:3007) ]]
[[mrcnn_detection/map/while/LoopCond/_22/_118]]
(1) Unknown: Failed to get convolution algorithm. This is probably because cuDNN failed to initialize, so try looking to see if a warning log message was printed above.
[[node conv1/convolution (defined at /usr/local/lib/python3.8/dist-packages/keras/backend/tensorflow_backend.py:3007) ]]
0 successful operations.
0 derived errors ignored.
[Op:__inference_keras_scratch_graph_13584]
Function call stack:
keras_scratch_graph -> keras_scratch_graph
NEW PHOTO
Processing 1 images
image shape: (2160, 3264, 3) min: 0.00000 max: 255.00000
molded_images shape: (1, 640, 640, 3) min: -123.70000 max: 151.10000
image_metas shape: (1, 17) min: 0.00000 max: 3264.00000
error while detecting the image : temp/1744110028_1329704_1350594994_fdff88ef7f3abc016e8dca3dc6361c11.jpg
2025-04-08 13:00:57.413337: E tensorflow/stream_executor/cuda/cuda_dnn.cc:329] Could not create cudnn handle: CUDNN_STATUS_INTERNAL_ERROR
2025-04-08 13:00:57.416321: E tensorflow/stream_executor/cuda/cuda_dnn.cc:329] Could not create cudnn handle: CUDNN_STATUS_INTERNAL_ERROR
2025-04-08 13:00:58.469780: E tensorflow/stream_executor/cuda/cuda_dnn.cc:329] Could not create cudnn handle: CUDNN_STATUS_INTERNAL_ERROR
2025-04-08 13:00:58.472598: E tensorflow/stream_executor/cuda/cuda_dnn.cc:329] Could not create cudnn handle: CUDNN_STATUS_INTERNAL_ERROR
2025-04-08 13:01:02.165727: I tensorflow/stream_executor/cuda/cuda_driver.cc:763] failed to allocate 4.00G (4294967296 bytes) from device: CUDA_ERROR_OUT_OF_MEMORY: out of memory
2025-04-08 13:01:02.166242: I tensorflow/stream_executor/cuda/cuda_driver.cc:763] failed to allocate 3.60G (3865470464 bytes) from device: CUDA_ERROR_OUT_OF_MEMORY: out of memory
2025-04-08 13:01:02.166700: I tensorflow/stream_executor/cuda/cuda_driver.cc:763] failed to allocate 3.24G (3478923264 bytes) from device: CUDA_ERROR_OUT_OF_MEMORY: out of memory
2025-04-08 13:01:02.167199: I tensorflow/stream_executor/cuda/cuda_driver.cc:763] failed to allocate 2.92G (3131030784 bytes) from device: CUDA_ERROR_OUT_OF_MEMORY: out of memory
2025-04-08 13:01:02.167690: I tensorflow/stream_executor/cuda/cuda_driver.cc:763] failed to allocate 2.62G (2817927680 bytes) from device: CUDA_ERROR_OUT_OF_MEMORY: out of memory
2025-04-08 13:01:02.168178: I tensorflow/stream_executor/cuda/cuda_driver.cc:763] failed to allocate 2.36G (2536134912 bytes) from device: CUDA_ERROR_OUT_OF_MEMORY: out of memory
2025-04-08 13:01:02.168790: I tensorflow/stream_executor/cuda/cuda_driver.cc:763] failed to allocate 2.12G (2282521344 bytes) from device: CUDA_ERROR_OUT_OF_MEMORY: out of memory
2025-04-08 13:01:02.168823: W tensorflow/core/common_runtime/bfc_allocator.cc:245] Allocator (GPU_0_bfc) ran out of memory trying to allocate 2.06GiB with freed_by_count=0. The caller indicates that this is not a failure, but may mean that there could be performance gains if more memory were available.
2025-04-08 13:01:02.169557: I tensorflow/stream_executor/cuda/cuda_driver.cc:763] failed to allocate 4.00G (4294967296 bytes) from device: CUDA_ERROR_OUT_OF_MEMORY: out of memory
2025-04-08 13:01:02.169580: W tensorflow/core/common_runtime/bfc_allocator.cc:245] Allocator (GPU_0_bfc) ran out of memory trying to allocate 2.06GiB with freed_by_count=0. The caller indicates that this is not a failure, but may mean that there could be performance gains if more memory were available.
2025-04-08 13:01:02.177032: I tensorflow/stream_executor/cuda/cuda_driver.cc:763] failed to allocate 4.00G (4294967296 bytes) from device: CUDA_ERROR_OUT_OF_MEMORY: out of memory
2025-04-08 13:01:02.177060: W tensorflow/core/common_runtime/bfc_allocator.cc:245] Allocator (GPU_0_bfc) ran out of memory trying to allocate 2.06GiB with freed_by_count=0. The caller indicates that this is not a failure, but may mean that there could be performance gains if more memory were available.
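Every photo above fails with the same "Failed to get convolution algorithm" error, yet the step keeps going and logs the offending path each time, so one repeatedly failing image cannot abort the whole batch. A hedged sketch of that per-image guard; model.detect follows the matterport Mask R-CNN API, while the image loading and result handling are illustrative.

    import skimage.io

    def detect_images(model, image_paths):
        """Run detection image by image; a failure on one photo must not abort the whole step."""
        results = {}
        for path in image_paths:
            print("NEW PHOTO")
            try:
                image = skimage.io.imread(path)
                # matterport Mask R-CNN API: detect() takes a list of images.
                results[path] = model.detect([image], verbose=1)[0]
            except Exception as exc:  # keep the batch alive and record the failure
                print("error while detecting the image :", path)
                print(exc)
                results[path] = None
        return results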
2025-04-08 13:01:02.177746: I tensorflow/stream_executor/cuda/cuda_driver.cc:763] failed to allocate 4.00G (4294967296 bytes) from device: CUDA_ERROR_OUT_OF_MEMORY: out of memory
2025-04-08 13:01:02.177768: W tensorflow/core/common_runtime/bfc_allocator.cc:245] Allocator (GPU_0_bfc) ran out of memory trying to allocate 2.06GiB with freed_by_count=0. The caller indicates that this is not a failure, but may mean that there could be performance gains if more memory were available.
2025-04-08 13:01:02.184239: I tensorflow/stream_executor/cuda/cuda_driver.cc:763] failed to allocate 4.00G (4294967296 bytes) from device: CUDA_ERROR_OUT_OF_MEMORY: out of memory
2025-04-08 13:01:02.184259: W tensorflow/core/common_runtime/bfc_allocator.cc:245] Allocator (GPU_0_bfc) ran out of memory trying to allocate 466.56MiB with freed_by_count=0. The caller indicates that this is not a failure, but may mean that there could be performance gains if more memory were available.
2025-04-08 13:01:02.221544: I tensorflow/stream_executor/cuda/cuda_driver.cc:763] failed to allocate 4.00G (4294967296 bytes) from device: CUDA_ERROR_OUT_OF_MEMORY: out of memory
2025-04-08 13:01:02.221563: W tensorflow/core/common_runtime/bfc_allocator.cc:245] Allocator (GPU_0_bfc) ran out of memory trying to allocate 243.25MiB with freed_by_count=0. The caller indicates that this is not a failure, but may mean that there could be performance gains if more memory were available.
2025-04-08 13:01:02.571106: I tensorflow/stream_executor/cuda/cuda_driver.cc:763] failed to allocate 4.00G (4294967296 bytes) from device: CUDA_ERROR_OUT_OF_MEMORY: out of memory
2025-04-08 13:01:02.571125: W tensorflow/core/kernels/gpu_utils.cc:49] Failed to allocate memory for convolution redzone checking; skipping this check. This is benign and only means that we won't check cudnn for out-of-bounds reads and writes. This message will only be printed once.
2025-04-08 13:01:02.581254: I tensorflow/stream_executor/cuda/cuda_driver.cc:763] failed to allocate 4.00G (4294967296 bytes) from device: CUDA_ERROR_OUT_OF_MEMORY: out of memory
2025-04-08 13:01:02.581738: I tensorflow/stream_executor/cuda/cuda_driver.cc:763] failed to allocate 4.00G (4294967296 bytes) from device: CUDA_ERROR_OUT_OF_MEMORY: out of memory
2025-04-08 13:01:02.590644: I tensorflow/stream_executor/cuda/cuda_driver.cc:763] failed to allocate 4.00G (4294967296 bytes) from device: CUDA_ERROR_OUT_OF_MEMORY: out of memory
2025-04-08 13:01:02.591161: I tensorflow/stream_executor/cuda/cuda_driver.cc:763] failed to allocate 4.00G (4294967296 bytes) from device: CUDA_ERROR_OUT_OF_MEMORY: out of memory
2025-04-08 13:01:02.591646: I tensorflow/stream_executor/cuda/cuda_driver.cc:763] failed to allocate 4.00G (4294967296 bytes) from device: CUDA_ERROR_OUT_OF_MEMORY: out of memory
2025-04-08 13:01:02.592110: I tensorflow/stream_executor/cuda/cuda_driver.cc:763] failed to allocate 4.00G (4294967296 bytes) from device: CUDA_ERROR_OUT_OF_MEMORY: out of memory
2025-04-08 13:01:02.592583: I tensorflow/stream_executor/cuda/cuda_driver.cc:763] failed to allocate 4.00G (4294967296 bytes) from device: CUDA_ERROR_OUT_OF_MEMORY: out of memory
2025-04-08 13:01:02.593080: I tensorflow/stream_executor/cuda/cuda_driver.cc:763] failed to allocate 4.00G (4294967296 bytes) from device: CUDA_ERROR_OUT_OF_MEMORY: out of memory
2025-04-08 13:01:26.184786: E tensorflow/stream_executor/dnn.cc:613] CUDNN_STATUS_EXECUTION_FAILED in tensorflow/stream_executor/cuda/cuda_dnn.cc(3158): 'cudnnConvolutionForward( cudnn.handle(), alpha, input_nd.handle(), input_data.opaque(), filter_nd.handle(), filter_data.opaque(), conv.handle(), ToConvForwardAlgo(algorithm_desc), scratch_memory.opaque(), scratch_memory.size(), beta, output_nd.handle(), output_data.opaque())'
2025-04-08 13:01:27.820563: I tensorflow/stream_executor/stream.cc:1990] [stream=0x1f548050,impl=0x1f546ce0] did not wait for [stream=0x1f547dd0,impl=0x1f546dc0]
2025-04-08 13:01:27.820638: I tensorflow/stream_executor/stream.cc:4938] [stream=0x1f548050,impl=0x1f546ce0] did not memcpy host-to-device; source: 0x37244680
2025-04-08 13:01:27.820746: I tensorflow/stream_executor/stream.cc:1990] [stream=0x1f548050,impl=0x1f546ce0] did not wait for [stream=0x1f547dd0,impl=0x1f546dc0]
2025-04-08 13:01:27.820719: F tensorflow/core/common_runtime/gpu/gpu_util.cc:340] CPU->GPU Memcpy failed
2025-04-08 13:01:27.820771: I tensorflow/stream_executor/stream.cc:4938] [stream=0x1f548050,impl=0x1f546ce0] did not memcpy host-to-device; source: 0x443b2e40
max_time_sub_proc : 3600
Caught exception !
Connect or reconnect !
in case -12
caffe_path_current :
About to save ! 2
After save, about to update current !
30.30user 34.11system 1:00:08elapsed 1%CPU (0avgtext+0avgdata 4864884maxresident)k 649984inputs+26936outputs (10912major+3573017minor)pagefaults 0swaps
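"Inside mask_sub_process" together with "max_time_sub_proc : 3600" suggests the detection step runs in a child process with a one-hour budget, and the closing "Caught exception ! Connect or reconnect !" lines are the parent recovering and reconnecting to the database afterwards. A hedged sketch of that wrapper pattern; the command line and return codes are illustrative, not the actual implementation.

    import subprocess

    MAX_TIME_SUB_PROC = 3600  # seconds, as logged above

    def run_sub_process(cmd):
        """Run the detection step in a child process and give up after max_time_sub_proc."""
        try:
            completed = subprocess.run(cmd, timeout=MAX_TIME_SUB_PROC, check=True)
            return completed.returncode
        except subprocess.TimeoutExpired:
            print("Caught exception ! sub process exceeded", MAX_TIME_SUB_PROC, "seconds")
            return -1
        except subprocess.CalledProcessError as exc:
            print("Caught exception ! sub process failed with code", exc.returncode)
            return exc.returncode

    # Illustrative command; the real wrapper builds its own argument list.
    run_sub_process(["python3", "-c", "print('mask_sub_process placeholder')"])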