Sun Feb 9 17:38:13 2025: Configuration file path: shell/lzq/eagle_commercial_llama3_2_1b_eagle_v11_gl_16k_128gpus_ga1.sh
Sun Feb 9 17:38:13 2025: Output directory path: work_dirs/eagle_commercial_llama3_2_1b_eagle_v11_gl_16k_128gpus_ga1
Sun Feb 9 17:38:13 2025: GPUs to use: 128
Sun Feb 9 17:38:13 2025: DEPENDENT_CLONES to use: 7
Sun Feb 9 17:38:13 2025: Use batch short: False
Sun Feb 9 17:38:13 2025: Use container: /home/zhidingy/workspace/dockers/torch_video.sqsh
Sun Feb 9 17:38:13 2025: nnodes to use: 16
Sun Feb 9 17:39:09 2025: monitor jobs in order [2901406, 2901409, 2901411, 2901414, 2901416, 2901418, 2901421, 2901424]
Sun Feb 9 17:39:09 2025: To cancel these jobs, use this command:
scancel 2901406 2901409 2901411 2901414 2901416 2901418 2901421 2901424
Sun Feb 9 17:39:09 2025: begin monitor job 2901406
Sun Feb 9 17:39:09 2025: check job: 2901406
Sun Feb 9 17:39:20 2025: job 2901406 PENDING
Sun Feb 9 17:44:20 2025: check job: 2901406
Sun Feb 9 17:44:25 2025: job 2901406 PENDING
Sun Feb 9 17:49:25 2025: check job: 2901406
Sun Feb 9 17:49:37 2025: job 2901406 PENDING
Sun Feb 9 17:54:37 2025: check job: 2901406
Sun Feb 9 17:54:49 2025: job 2901406 PENDING
Sun Feb 9 17:59:49 2025: check job: 2901406
Sun Feb 9 18:00:06 2025: job 2901406 PENDING
Sun Feb 9 18:05:06 2025: check job: 2901406
Sun Feb 9 18:05:19 2025: job 2901406 PENDING
Sun Feb 9 18:10:19 2025: check job: 2901406
Sun Feb 9 18:15:38 2025: check job: 2901406
Sun Feb 9 18:20:54 2025: check job: 2901406
Sun Feb 9 18:26:03 2025: check job: 2901406
Sun Feb 9 18:31:18 2025: check job: 2901406
Sun Feb 9 18:36:35 2025: check job: 2901406
Sun Feb 9 18:41:42 2025: check job: 2901406
Sun Feb 9 18:46:55 2025: check job: 2901406
Sun Feb 9 18:52:10 2025: check job: 2901406
Sun Feb 9 18:57:20 2025: check job: 2901406
Sun Feb 9 19:02:32 2025: check job: 2901406
Sun Feb 9 19:07:46 2025: check job: 2901406
Sun Feb 9 19:12:58 2025: check job: 2901406
Sun Feb 9 19:18:18 2025: check job: 2901406
Sun Feb 9 19:23:33 2025: check job: 2901406
Sun Feb 9 19:28:45 2025: check job: 2901406
Sun Feb 9 19:33:56 2025: check job: 2901406
Sun Feb 9 19:39:08 2025: check job: 2901406
Sun Feb 9 19:44:20 2025: check job: 2901406
Sun Feb 9 19:49:34 2025: check job: 2901406
Sun Feb 9 19:54:46 2025: check job: 2901406
Sun Feb 9 19:59:57 2025: check job: 2901406
Sun Feb 9 20:05:10 2025: check job: 2901406
Sun Feb 9 20:10:22 2025: check job: 2901406
Sun Feb 9 20:15:33 2025: check job: 2901406
Sun Feb 9 20:20:47 2025: check job: 2901406
Sun Feb 9 20:25:58 2025: check job: 2901406
Sun Feb 9 20:31:10 2025: check job: 2901406
Sun Feb 9 20:36:21 2025: check job: 2901406
Sun Feb 9 20:41:33 2025: check job: 2901406
Sun Feb 9 20:46:45 2025: check job: 2901406
Sun Feb 9 20:51:57 2025: check job: 2901406
Sun Feb 9 20:57:10 2025: check job: 2901406
Sun Feb 9 21:02:22 2025: check job: 2901406
Sun Feb 9 21:07:35 2025: check job: 2901406
Sun Feb 9 21:12:48 2025: check job: 2901406
Sun Feb 9 21:17:59 2025: check job: 2901406
Sun Feb 9 21:23:11 2025: check job: 2901406
Sun Feb 9 21:28:22 2025: check job: 2901406
Sun Feb 9 21:33:34 2025: check job: 2901406
Sun Feb 9 21:38:47 2025: check job: 2901406
Sun Feb 9 21:43:59 2025: check job: 2901406
Sun Feb 9 21:49:12 2025: check job: 2901406
Sun Feb 9 21:54:24 2025: check job: 2901406
Sun Feb 9 21:59:36 2025: check job: 2901406
Sun Feb 9 22:04:48 2025: check job: 2901406
Sun Feb 9 22:10:02 2025: check job: 2901406
Sun Feb 9 22:10:13 2025: job 2901406 done
Sun Feb 9 22:10:13 2025: begin monitor job 2901409
Sun Feb 9 22:10:13 2025: check job: 2901409
Sun Feb 9 22:10:17 2025: job 2901409 PENDING
Sun Feb 9 22:15:17 2025: check job: 2901409
Sun Feb 9 22:20:25 2025: check job: 2901409
Sun Feb 9 22:25:37 2025: check job: 2901409
Sun Feb 9 22:30:48 2025: check job: 2901409
Sun Feb 9 22:36:00 2025: check job: 2901409
Sun Feb 9 22:41:12 2025: check job: 2901409
Sun Feb 9 22:46:24 2025: check job: 2901409
Sun Feb 9 22:51:31 2025: check job: 2901409
Sun Feb 9 22:56:42 2025: check job: 2901409
Sun Feb 9 23:01:47 2025: check job: 2901409
Sun Feb 9 23:06:52 2025: check job: 2901409
Sun Feb 9 23:12:06 2025: check job: 2901409
Sun Feb 9 23:17:11 2025: check job: 2901409
Sun Feb 9 23:22:16 2025: check job: 2901409
Sun Feb 9 23:27:26 2025: check job: 2901409
Sun Feb 9 23:32:37 2025: check job: 2901409
Sun Feb 9 23:37:47 2025: check job: 2901409
Sun Feb 9 23:43:02 2025: check job: 2901409
Sun Feb 9 23:48:14 2025: check job: 2901409
Sun Feb 9 23:53:25 2025: check job: 2901409
Sun Feb 9 23:58:39 2025: check job: 2901409
Mon Feb 10 00:03:50 2025: check job: 2901409
Mon Feb 10 00:09:01 2025: check job: 2901409
Mon Feb 10 00:14:14 2025: check job: 2901409
Mon Feb 10 00:19:27 2025: check job: 2901409
Mon Feb 10 00:24:41 2025: check job: 2901409
Mon Feb 10 00:29:54 2025: check job: 2901409
Mon Feb 10 00:35:05 2025: check job: 2901409
Mon Feb 10 00:40:17 2025: check job: 2901409
Mon Feb 10 00:45:29 2025: check job: 2901409
Mon Feb 10 00:50:40 2025: check job: 2901409
Mon Feb 10 00:55:53 2025: check job: 2901409
Mon Feb 10 01:01:04 2025: check job: 2901409
Mon Feb 10 01:06:18 2025: check job: 2901409
Mon Feb 10 01:11:32 2025: check job: 2901409
Mon Feb 10 01:16:44 2025: check job: 2901409
Mon Feb 10 01:21:57 2025: check job: 2901409
Mon Feb 10 01:27:10 2025: check job: 2901409
Mon Feb 10 01:32:22 2025: check job: 2901409
Mon Feb 10 01:37:35 2025: check job: 2901409
Mon Feb 10 01:42:46 2025: check job: 2901409
Mon Feb 10 01:48:06 2025: check job: 2901409
Mon Feb 10 01:53:21 2025: check job: 2901409
Mon Feb 10 01:58:39 2025: check job: 2901409
Mon Feb 10 02:03:55 2025: check job: 2901409
Mon Feb 10 02:09:12 2025: check job: 2901409
Mon Feb 10 02:14:25 2025: check job: 2901409
Mon Feb 10 02:14:39 2025: job 2901409 done
Mon Feb 10 02:14:39 2025: begin monitor job 2901411
Mon Feb 10 02:14:39 2025: check job: 2901411
Mon Feb 10 02:19:46 2025: check job: 2901411
Mon Feb 10 02:25:00 2025: check job: 2901411
Mon Feb 10 02:30:13 2025: check job: 2901411
Mon Feb 10 02:35:29 2025: check job: 2901411
Mon Feb 10 02:40:43 2025: check job: 2901411
Mon Feb 10 02:45:56 2025: check job: 2901411
Mon Feb 10 02:51:10 2025: check job: 2901411
Mon Feb 10 02:56:22 2025: check job: 2901411
Mon Feb 10 03:01:37 2025: check job: 2901411
Mon Feb 10 03:06:51 2025: check job: 2901411
Mon Feb 10 03:12:09 2025: check job: 2901411
Mon Feb 10 03:17:34 2025: check job: 2901411
Mon Feb 10 03:22:54 2025: check job: 2901411
Mon Feb 10 03:28:21 2025: check job: 2901411
Mon Feb 10 03:33:33 2025: check job: 2901411
Mon Feb 10 03:38:51 2025: check job: 2901411
Mon Feb 10 03:44:07 2025: check job: 2901411
Mon Feb 10 03:49:18 2025: check job: 2901411
Mon Feb 10 03:54:36 2025: check job: 2901411
Mon Feb 10 03:59:47 2025: check job: 2901411
Mon Feb 10 04:04:59 2025: check job: 2901411
Mon Feb 10 04:10:23 2025: check job: 2901411
Mon Feb 10 04:15:38 2025: check job: 2901411
Mon Feb 10 04:21:05 2025: check job: 2901411
Mon Feb 10 04:26:16 2025: check job: 2901411
Mon Feb 10 04:31:43 2025: check job: 2901411
Mon Feb 10 04:37:02 2025: check job: 2901411
Mon Feb 10 04:42:22 2025: check job: 2901411
Mon Feb 10 04:47:57 2025: check job: 2901411
Mon Feb 10 04:53:11 2025: check job: 2901411
Mon Feb 10 04:58:23 2025: check job: 2901411
Mon Feb 10 05:03:37 2025: check job: 2901411
Mon Feb 10 05:08:51 2025: check job: 2901411
Mon Feb 10 05:14:28 2025: check job: 2901411
Mon Feb 10 05:14:41 2025: job 2901411 done
Mon Feb 10 05:14:41 2025: work_dirs/eagle_commercial_llama3_2_1b_eagle_v11_gl_16k_128gpus_ga1 finish training
Mon Feb 10 05:14:42 2025: MiaoTiXing: Access successful.
Mon Feb 10 05:14:42 2025: MiaoTiXing: Access successful.
Mon Feb 10 05:14:42 2025: work_dirs/eagle_commercial_llama3_2_1b_eagle_v11_gl_16k_128gpus_ga1 finish training, start auto testing
Mon Feb 10 05:19:11 2025: CLUSTER=NSS CLUSTER_STACK=NSS CLUSTER_NAME=DRACO_OCI_IAD subdir=nss
Mon Feb 10 05:19:11 2025: [2025-02-10 05:14:45.883 AM PST][INFO][load_config]: Loading NSS config: /home/adlr/adlr-utils/release/cluster-interface/latest/nss/config-draco-oci.json
Mon Feb 10 05:19:11 2025: [2025-02-10 05:14:45.884 AM PST][INFO][load_config]: Overriding from env: NSS_ADLR_PYTHON=NSSSUB_ADLR_UTILS_ENV_ROOT/python/latest/bin/python
Mon Feb 10 05:19:11 2025: [2025-02-10 05:14:45.889 AM PST][WARNING][submit_slurm_parent]:
Mon Feb 10 05:19:11 2025:
Mon Feb 10 05:19:11 2025:
Mon Feb 10 05:19:11 2025: !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
Mon Feb 10 05:19:11 2025: This host seems to be a data copier or container build node, which usually will not be set up to submit jobs.
Mon Feb 10 05:19:11 2025: Did you possibly mean to do it from a login node instead?
Mon Feb 10 05:19:11 2025: !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
Mon Feb 10 05:19:11 2025:
Mon Feb 10 05:19:11 2025: [2025-02-10 05:14:46.274 AM PST][WARNING][submit_job]:
Mon Feb 10 05:19:11 2025: ================
Mon Feb 10 05:19:11 2025: NOTICE
Mon Feb 10 05:19:11 2025: ================
Mon Feb 10 05:19:11 2025:
Mon Feb 10 05:19:11 2025: The log directory structure will be changing in an upcoming release.
Mon Feb 10 05:19:11 2025:
Mon Feb 10 05:19:11 2025: Please use the temporary `--preview_new_logdir` option to try it out with your jobs beforehand.
Mon Feb 10 05:19:11 2025: After the preview period, the new structure will be used for all new jobs (except autoresume follow-ups, which will keep
Mon Feb 10 05:19:11 2025: their original job's log directory structure).
Mon Feb 10 05:19:11 2025:
Mon Feb 10 05:19:11 2025: Please be advised that some file locations may change due to the new structure,
Mon Feb 10 05:19:11 2025: but user code should be unaffected in most cases.
Mon Feb 10 05:19:11 2025:
Mon Feb 10 05:19:11 2025: If you encounter any issues or have feedback, please reach out to `@adlr-support` in Slack.
Mon Feb 10 05:19:11 2025:
Mon Feb 10 05:19:11 2025: [2025-02-10 05:14:46.274 AM PST][WARNING][submit_job]: `--autoresume_method` is deprecated and will be removed in a future release, when all follow-ups use the requeue method.
Mon Feb 10 05:19:11 2025: Please reach out to `@adlr-support` in Slack if you rely on it and need to discuss options.
Mon Feb 10 05:19:11 2025:
Mon Feb 10 05:19:11 2025: [2025-02-10 05:14:49.631 AM PST][INFO][submit_slurm_parent]: Forcing exclusive mode
Mon Feb 10 05:19:11 2025: [2025-02-10 05:14:49.633 AM PST][INFO][submit_job]: Creating the logdir: /lustre/fs12/portfolios/llmservice/users/zhidingy/vlmeval/work_dirs/eval/Eagle-Next/MMMU_DEV_VAL_20250210-051449
Mon Feb 10 05:19:11 2025: [2025-02-10 05:14:50.713 AM PST][INFO][submit_slurm_parent]: srun_commands=srun --kill-on-bad-exit=1 --container-image=/home/zhidingy/workspace/eagle2/torch2_test.sqsh --container-mounts=/lustre/fs11/portfolios/adlr/projects/adlr_other_infra/python:/lustre/fs11/portfolios/adlr/projects/adlr_other_infra/python:ro,/lustre/fs11/portfolios/adlr/projects/adlr_other_infra/release/cluster-interface/13.11_2025-02-05_11-20-02:/lustre/fs11/portfolios/adlr/projects/adlr_other_infra/release/cluster-interface/13.11_2025-02-05_11-20-02:ro,/home/adlr/adlr-utils/release/cluster-interface/latest:/home/adlr/adlr-utils/release/cluster-interface/latest:ro,/dev/fuse:/dev/fuse:rw,/home/zhidingy:/home/zhidingy:rw,/lustre/fs12/portfolios/llmservice/users/zhidingy/vlmeval/work_dirs/eval/Eagle-Next/MMMU_DEV_VAL_20250210-051449:/lustre/fs12/portfolios/llmservice/users/zhidingy/vlmeval/work_dirs/eval/Eagle-Next/MMMU_DEV_VAL_20250210-051449:rw,/lustre/fs12/portfolios/llmservice/users/zhidingy/vlmeval:/lustre/fs12/portfolios/llmservice/users/zhidingy/vlmeval:rw,/home/:/home/:rw,/lustre:/lustre:rw /lustre/fs12/portfolios/llmservice/users/zhidingy/vlmeval/work_dirs/eval/Eagle-Next/MMMU_DEV_VAL_20250210-051449/node_command_MMMU_DEV_VAL_20250210-051449.sh &
Mon Feb 10 05:19:11 2025: [2025-02-10 05:14:50.718 AM PST][INFO][submit_job]: Details of submit command: sbatch /lustre/fs12/portfolios/llmservice/users/zhidingy/vlmeval/work_dirs/eval/Eagle-Next/MMMU_DEV_VAL_20250210-051449/sbatch_MMMU_DEV_VAL_20250210-051449.sh
Mon Feb 10 05:19:11 2025: [2025-02-10 05:14:50.722 AM PST][INFO][utils]: Executing command: /lustre/fs12/portfolios/llmservice/users/zhidingy/vlmeval/work_dirs/eval/Eagle-Next/MMMU_DEV_VAL_20250210-051449/cluster_submit_command_MMMU_DEV_VAL_20250210-051449.sh
Mon Feb 10 05:19:11 2025: [2025-02-10 05:14:51.246 AM PST][INFO][utils]: Stdout:
Mon Feb 10 05:19:11 2025: Submitted batch job 2906989
Mon Feb 10 05:19:11 2025:
Mon Feb 10 05:19:11 2025: [2025-02-10 05:14:51.246 AM PST][INFO][slurm]: Job Id is 2906989
Mon Feb 10 05:19:11 2025: [2025-02-10 05:14:51.607 AM PST][INFO][submit_job]: Non blocking execution - job has been submitted.
Mon Feb 10 05:19:11 2025: CLUSTER=NSS CLUSTER_STACK=NSS CLUSTER_NAME=DRACO_OCI_IAD subdir=nss
Mon Feb 10 05:19:11 2025: [2025-02-10 05:14:55.019 AM PST][INFO][load_config]: Loading NSS config: /home/adlr/adlr-utils/release/cluster-interface/latest/nss/config-draco-oci.json
Mon Feb 10 05:19:11 2025: [2025-02-10 05:14:55.020 AM PST][INFO][load_config]: Overriding from env: NSS_ADLR_PYTHON=NSSSUB_ADLR_UTILS_ENV_ROOT/python/latest/bin/python
Mon Feb 10 05:19:11 2025: [2025-02-10 05:14:55.023 AM PST][WARNING][submit_slurm_parent]:
Mon Feb 10 05:19:11 2025:
Mon Feb 10 05:19:11 2025:
Mon Feb 10 05:19:11 2025: !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
Mon Feb 10 05:19:11 2025: This host seems to be a data copier or container build node, which usually will not be set up to submit jobs.
Mon Feb 10 05:19:11 2025: Did you possibly mean to do it from a login node instead?
Mon Feb 10 05:19:11 2025: !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
Mon Feb 10 05:19:11 2025:
Mon Feb 10 05:19:11 2025: [2025-02-10 05:14:55.429 AM PST][WARNING][submit_job]:
Mon Feb 10 05:19:11 2025: ================
Mon Feb 10 05:19:11 2025: NOTICE
Mon Feb 10 05:19:11 2025: ================
Mon Feb 10 05:19:11 2025:
Mon Feb 10 05:19:11 2025: The log directory structure will be changing in an upcoming release.
Mon Feb 10 05:19:11 2025:
Mon Feb 10 05:19:11 2025: Please use the temporary `--preview_new_logdir` option to try it out with your jobs beforehand.
Mon Feb 10 05:19:11 2025: After the preview period, the new structure will be used for all new jobs (except autoresume follow-ups, which will keep
Mon Feb 10 05:19:11 2025: their original job's log directory structure).
Mon Feb 10 05:19:11 2025:
Mon Feb 10 05:19:11 2025: Please be advised that some file locations may change due to the new structure,
Mon Feb 10 05:19:11 2025: but user code should be unaffected in most cases.
Mon Feb 10 05:19:11 2025:
Mon Feb 10 05:19:11 2025: If you encounter any issues or have feedback, please reach out to `@adlr-support` in Slack.
Mon Feb 10 05:19:11 2025:
Mon Feb 10 05:19:11 2025: [2025-02-10 05:14:55.429 AM PST][WARNING][submit_job]: `--autoresume_method` is deprecated and will be removed in a future release, when all follow-ups use the requeue method.
Mon Feb 10 05:19:11 2025: Please reach out to `@adlr-support` in Slack if you rely on it and need to discuss options.
Mon Feb 10 05:19:11 2025:
Mon Feb 10 05:19:11 2025: [2025-02-10 05:14:58.363 AM PST][INFO][submit_slurm_parent]: Forcing exclusive mode
Mon Feb 10 05:19:11 2025: [2025-02-10 05:14:58.363 AM PST][INFO][submit_job]: Creating the logdir: /lustre/fs12/portfolios/llmservice/users/zhidingy/vlmeval/work_dirs/eval/Eagle-Next/ChartQA_TEST_20250210-051458
Mon Feb 10 05:19:11 2025: [2025-02-10 05:14:59.283 AM PST][INFO][submit_slurm_parent]: srun_commands=srun --kill-on-bad-exit=1 --container-image=/home/zhidingy/workspace/eagle2/torch2_test.sqsh --container-mounts=/lustre/fs11/portfolios/adlr/projects/adlr_other_infra/python:/lustre/fs11/portfolios/adlr/projects/adlr_other_infra/python:ro,/lustre/fs11/portfolios/adlr/projects/adlr_other_infra/release/cluster-interface/13.11_2025-02-05_11-20-02:/lustre/fs11/portfolios/adlr/projects/adlr_other_infra/release/cluster-interface/13.11_2025-02-05_11-20-02:ro,/home/adlr/adlr-utils/release/cluster-interface/latest:/home/adlr/adlr-utils/release/cluster-interface/latest:ro,/dev/fuse:/dev/fuse:rw,/home/zhidingy:/home/zhidingy:rw,/lustre/fs12/portfolios/llmservice/users/zhidingy/vlmeval/work_dirs/eval/Eagle-Next/ChartQA_TEST_20250210-051458:/lustre/fs12/portfolios/llmservice/users/zhidingy/vlmeval/work_dirs/eval/Eagle-Next/ChartQA_TEST_20250210-051458:rw,/lustre/fs12/portfolios/llmservice/users/zhidingy/vlmeval:/lustre/fs12/portfolios/llmservice/users/zhidingy/vlmeval:rw,/home/:/home/:rw,/lustre:/lustre:rw /lustre/fs12/portfolios/llmservice/users/zhidingy/vlmeval/work_dirs/eval/Eagle-Next/ChartQA_TEST_20250210-051458/node_command_ChartQA_TEST_20250210-051458.sh &
Mon Feb 10 05:19:11 2025: [2025-02-10 05:14:59.287 AM PST][INFO][submit_job]: Details of submit command: sbatch /lustre/fs12/portfolios/llmservice/users/zhidingy/vlmeval/work_dirs/eval/Eagle-Next/ChartQA_TEST_20250210-051458/sbatch_ChartQA_TEST_20250210-051458.sh
Mon Feb 10 05:19:11 2025: [2025-02-10 05:14:59.292 AM PST][INFO][utils]: Executing command: /lustre/fs12/portfolios/llmservice/users/zhidingy/vlmeval/work_dirs/eval/Eagle-Next/ChartQA_TEST_20250210-051458/cluster_submit_command_ChartQA_TEST_20250210-051458.sh
Mon Feb 10 05:19:11 2025: [2025-02-10 05:14:59.843 AM PST][INFO][utils]: Stdout:
Mon Feb 10 05:19:11 2025: Submitted batch job 2906994
Mon Feb 10 05:19:11 2025:
Mon Feb 10 05:19:11 2025: [2025-02-10 05:14:59.843 AM PST][INFO][slurm]: Job Id is 2906994
Mon Feb 10 05:19:11 2025: [2025-02-10 05:15:00.212 AM PST][INFO][submit_job]: Non blocking execution - job has been submitted.
Mon Feb 10 05:19:11 2025: CLUSTER=NSS CLUSTER_STACK=NSS CLUSTER_NAME=DRACO_OCI_IAD subdir=nss
Mon Feb 10 05:19:11 2025: [2025-02-10 05:15:04.674 AM PST][INFO][load_config]: Loading NSS config: /home/adlr/adlr-utils/release/cluster-interface/latest/nss/config-draco-oci.json
Mon Feb 10 05:19:11 2025: [2025-02-10 05:15:04.676 AM PST][INFO][load_config]: Overriding from env: NSS_ADLR_PYTHON=NSSSUB_ADLR_UTILS_ENV_ROOT/python/latest/bin/python
Mon Feb 10 05:19:11 2025: [2025-02-10 05:15:04.680 AM PST][WARNING][submit_slurm_parent]:
Mon Feb 10 05:19:11 2025:
Mon Feb 10 05:19:11 2025:
Mon Feb 10 05:19:11 2025: !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
Mon Feb 10 05:19:11 2025: This host seems to be a data copier or container build node, which usually will not be set up to submit jobs.
Mon Feb 10 05:19:11 2025: Did you possibly mean to do it from a login node instead?
Mon Feb 10 05:19:11 2025: !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
Mon Feb 10 05:19:11 2025:
Mon Feb 10 05:19:11 2025: [2025-02-10 05:15:05.063 AM PST][WARNING][submit_job]:
Mon Feb 10 05:19:11 2025: ================
Mon Feb 10 05:19:11 2025: NOTICE
Mon Feb 10 05:19:11 2025: ================
Mon Feb 10 05:19:11 2025:
Mon Feb 10 05:19:11 2025: The log directory structure will be changing in an upcoming release.
Mon Feb 10 05:19:11 2025:
Mon Feb 10 05:19:11 2025: Please use the temporary `--preview_new_logdir` option to try it out with your jobs beforehand.
Mon Feb 10 05:19:11 2025: After the preview period, the new structure will be used for all new jobs (except autoresume follow-ups, which will keep
Mon Feb 10 05:19:11 2025: their original job's log directory structure).
Mon Feb 10 05:19:11 2025:
Mon Feb 10 05:19:11 2025: Please be advised that some file locations may change due to the new structure,
Mon Feb 10 05:19:11 2025: but user code should be unaffected in most cases.
Mon Feb 10 05:19:11 2025:
Mon Feb 10 05:19:11 2025: If you encounter any issues or have feedback, please reach out to `@adlr-support` in Slack.
Mon Feb 10 05:19:11 2025:
Mon Feb 10 05:19:11 2025: [2025-02-10 05:15:05.063 AM PST][WARNING][submit_job]: `--autoresume_method` is deprecated and will be removed in a future release, when all follow-ups use the requeue method.
Mon Feb 10 05:19:11 2025: Please reach out to `@adlr-support` in Slack if you rely on it and need to discuss options.
Mon Feb 10 05:19:11 2025:
Mon Feb 10 05:19:11 2025: [2025-02-10 05:15:08.447 AM PST][INFO][submit_slurm_parent]: Forcing exclusive mode
Mon Feb 10 05:19:11 2025: [2025-02-10 05:15:08.447 AM PST][INFO][submit_job]: Creating the logdir: /lustre/fs12/portfolios/llmservice/users/zhidingy/vlmeval/work_dirs/eval/Eagle-Next/DocVQA_VAL_20250210-051508
Mon Feb 10 05:19:11 2025: [2025-02-10 05:15:09.399 AM PST][INFO][submit_slurm_parent]: srun_commands=srun --kill-on-bad-exit=1 --container-image=/home/zhidingy/workspace/eagle2/torch2_test.sqsh --container-mounts=/lustre/fs11/portfolios/adlr/projects/adlr_other_infra/python:/lustre/fs11/portfolios/adlr/projects/adlr_other_infra/python:ro,/lustre/fs11/portfolios/adlr/projects/adlr_other_infra/release/cluster-interface/13.11_2025-02-05_11-20-02:/lustre/fs11/portfolios/adlr/projects/adlr_other_infra/release/cluster-interface/13.11_2025-02-05_11-20-02:ro,/home/adlr/adlr-utils/release/cluster-interface/latest:/home/adlr/adlr-utils/release/cluster-interface/latest:ro,/dev/fuse:/dev/fuse:rw,/home/zhidingy:/home/zhidingy:rw,/lustre/fs12/portfolios/llmservice/users/zhidingy/vlmeval/work_dirs/eval/Eagle-Next/DocVQA_VAL_20250210-051508:/lustre/fs12/portfolios/llmservice/users/zhidingy/vlmeval/work_dirs/eval/Eagle-Next/DocVQA_VAL_20250210-051508:rw,/lustre/fs12/portfolios/llmservice/users/zhidingy/vlmeval:/lustre/fs12/portfolios/llmservice/users/zhidingy/vlmeval:rw,/home/:/home/:rw,/lustre:/lustre:rw /lustre/fs12/portfolios/llmservice/users/zhidingy/vlmeval/work_dirs/eval/Eagle-Next/DocVQA_VAL_20250210-051508/node_command_DocVQA_VAL_20250210-051508.sh &
Mon Feb 10 05:19:11 2025: [2025-02-10 05:15:09.404 AM PST][INFO][submit_job]: Details of submit command: sbatch /lustre/fs12/portfolios/llmservice/users/zhidingy/vlmeval/work_dirs/eval/Eagle-Next/DocVQA_VAL_20250210-051508/sbatch_DocVQA_VAL_20250210-051508.sh
Mon Feb 10 05:19:11 2025: [2025-02-10 05:15:09.409 AM PST][INFO][utils]: Executing command: /lustre/fs12/portfolios/llmservice/users/zhidingy/vlmeval/work_dirs/eval/Eagle-Next/DocVQA_VAL_20250210-051508/cluster_submit_command_DocVQA_VAL_20250210-051508.sh
Mon Feb 10 05:19:11 2025: [2025-02-10 05:15:11.263 AM PST][INFO][utils]: Stdout:
Mon Feb 10 05:19:11 2025: Submitted batch job 2907000
Mon Feb 10 05:19:11 2025:
Mon Feb 10 05:19:11 2025: [2025-02-10 05:15:11.264 AM PST][INFO][slurm]: Job Id is 2907000
Mon Feb 10 05:19:11 2025: [2025-02-10 05:15:11.634 AM PST][INFO][submit_job]: Non blocking execution - job has been submitted.
Mon Feb 10 05:19:11 2025: CLUSTER=NSS CLUSTER_STACK=NSS CLUSTER_NAME=DRACO_OCI_IAD subdir=nss
Mon Feb 10 05:19:11 2025: [2025-02-10 05:15:14.894 AM PST][INFO][load_config]: Loading NSS config: /home/adlr/adlr-utils/release/cluster-interface/latest/nss/config-draco-oci.json
Mon Feb 10 05:19:11 2025: [2025-02-10 05:15:14.895 AM PST][INFO][load_config]: Overriding from env: NSS_ADLR_PYTHON=NSSSUB_ADLR_UTILS_ENV_ROOT/python/latest/bin/python
Mon Feb 10 05:19:11 2025: [2025-02-10 05:15:14.898 AM PST][WARNING][submit_slurm_parent]:
Mon Feb 10 05:19:11 2025:
Mon Feb 10 05:19:11 2025:
Mon Feb 10 05:19:11 2025: !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
Mon Feb 10 05:19:11 2025: This host seems to be a data copier or container build node, which usually will not be set up to submit jobs.
Mon Feb 10 05:19:11 2025: Did you possibly mean to do it from a login node instead?
Mon Feb 10 05:19:11 2025: !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
Mon Feb 10 05:19:11 2025:
Mon Feb 10 05:19:11 2025: [2025-02-10 05:15:15.287 AM PST][WARNING][submit_job]:
Mon Feb 10 05:19:11 2025: ================
Mon Feb 10 05:19:11 2025: NOTICE
Mon Feb 10 05:19:11 2025: ================
Mon Feb 10 05:19:11 2025:
Mon Feb 10 05:19:11 2025: The log directory structure will be changing in an upcoming release.
Mon Feb 10 05:19:11 2025:
Mon Feb 10 05:19:11 2025: Please use the temporary `--preview_new_logdir` option to try it out with your jobs beforehand.
Mon Feb 10 05:19:11 2025: After the preview period, the new structure will be used for all new jobs (except autoresume follow-ups, which will keep
Mon Feb 10 05:19:11 2025: their original job's log directory structure).
Mon Feb 10 05:19:11 2025:
Mon Feb 10 05:19:11 2025: Please be advised that some file locations may change due to the new structure,
Mon Feb 10 05:19:11 2025: but user code should be unaffected in most cases.
Mon Feb 10 05:19:11 2025:
Mon Feb 10 05:19:11 2025: If you encounter any issues or have feedback, please reach out to `@adlr-support` in Slack.
Mon Feb 10 05:19:11 2025:
Mon Feb 10 05:19:11 2025: [2025-02-10 05:15:15.287 AM PST][WARNING][submit_job]: `--autoresume_method` is deprecated and will be removed in a future release, when all follow-ups use the requeue method.
Mon Feb 10 05:19:11 2025: Please reach out to `@adlr-support` in Slack if you rely on it and need to discuss options.
Mon Feb 10 05:19:11 2025:
Mon Feb 10 05:19:11 2025: [2025-02-10 05:15:18.751 AM PST][INFO][submit_slurm_parent]: Forcing exclusive mode
Mon Feb 10 05:19:11 2025: [2025-02-10 05:15:18.751 AM PST][INFO][submit_job]: Creating the logdir: /lustre/fs12/portfolios/llmservice/users/zhidingy/vlmeval/work_dirs/eval/Eagle-Next/MMMU_Pro_20250210-051518
Mon Feb 10 05:19:11 2025: [2025-02-10 05:15:19.715 AM PST][INFO][submit_slurm_parent]: srun_commands=srun --kill-on-bad-exit=1 --container-image=/home/zhidingy/workspace/eagle2/torch2_test.sqsh --container-mounts=/lustre/fs11/portfolios/adlr/projects/adlr_other_infra/python:/lustre/fs11/portfolios/adlr/projects/adlr_other_infra/python:ro,/lustre/fs11/portfolios/adlr/projects/adlr_other_infra/release/cluster-interface/13.11_2025-02-05_11-20-02:/lustre/fs11/portfolios/adlr/projects/adlr_other_infra/release/cluster-interface/13.11_2025-02-05_11-20-02:ro,/home/adlr/adlr-utils/release/cluster-interface/latest:/home/adlr/adlr-utils/release/cluster-interface/latest:ro,/dev/fuse:/dev/fuse:rw,/home/zhidingy:/home/zhidingy:rw,/lustre/fs12/portfolios/llmservice/users/zhidingy/vlmeval/work_dirs/eval/Eagle-Next/MMMU_Pro_20250210-051518:/lustre/fs12/portfolios/llmservice/users/zhidingy/vlmeval/work_dirs/eval/Eagle-Next/MMMU_Pro_20250210-051518:rw,/lustre/fs12/portfolios/llmservice/users/zhidingy/vlmeval:/lustre/fs12/portfolios/llmservice/users/zhidingy/vlmeval:rw,/home/:/home/:rw,/lustre:/lustre:rw /lustre/fs12/portfolios/llmservice/users/zhidingy/vlmeval/work_dirs/eval/Eagle-Next/MMMU_Pro_20250210-051518/node_command_MMMU_Pro_20250210-051518.sh &
Mon Feb 10 05:19:11 2025: [2025-02-10 05:15:19.720 AM PST][INFO][submit_job]: Details of submit command: sbatch /lustre/fs12/portfolios/llmservice/users/zhidingy/vlmeval/work_dirs/eval/Eagle-Next/MMMU_Pro_20250210-051518/sbatch_MMMU_Pro_20250210-051518.sh
Mon Feb 10 05:19:11 2025: [2025-02-10 05:15:19.725 AM PST][INFO][utils]: Executing command: /lustre/fs12/portfolios/llmservice/users/zhidingy/vlmeval/work_dirs/eval/Eagle-Next/MMMU_Pro_20250210-051518/cluster_submit_command_MMMU_Pro_20250210-051518.sh
Mon Feb 10 05:19:11 2025: [2025-02-10 05:15:20.275 AM PST][INFO][utils]: Stdout:
Mon Feb 10 05:19:11 2025: Submitted batch job 2907007
Mon Feb 10 05:19:11 2025:
Mon Feb 10 05:19:11 2025: [2025-02-10 05:15:20.275 AM PST][INFO][slurm]: Job Id is 2907007
Mon Feb 10 05:19:11 2025: [2025-02-10 05:15:20.646 AM PST][INFO][submit_job]: Non blocking execution - job has been submitted.
Mon Feb 10 05:19:11 2025: CLUSTER=NSS CLUSTER_STACK=NSS CLUSTER_NAME=DRACO_OCI_IAD subdir=nss
Mon Feb 10 05:19:11 2025: [2025-02-10 05:15:24.127 AM PST][INFO][load_config]: Loading NSS config: /home/adlr/adlr-utils/release/cluster-interface/latest/nss/config-draco-oci.json
Mon Feb 10 05:19:11 2025: [2025-02-10 05:15:24.128 AM PST][INFO][load_config]: Overriding from env: NSS_ADLR_PYTHON=NSSSUB_ADLR_UTILS_ENV_ROOT/python/latest/bin/python
Mon Feb 10 05:19:11 2025: [2025-02-10 05:15:24.131 AM PST][WARNING][submit_slurm_parent]:
Mon Feb 10 05:19:11 2025:
Mon Feb 10 05:19:11 2025:
Mon Feb 10 05:19:11 2025: !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
Mon Feb 10 05:19:11 2025: This host seems to be a data copier or container build node, which usually will not be set up to submit jobs.
Mon Feb 10 05:19:11 2025: Did you possibly mean to do it from a login node instead?
Mon Feb 10 05:19:11 2025: !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
Mon Feb 10 05:19:11 2025:
Mon Feb 10 05:19:11 2025: [2025-02-10 05:15:24.516 AM PST][WARNING][submit_job]:
Mon Feb 10 05:19:11 2025: ================
Mon Feb 10 05:19:11 2025: NOTICE
Mon Feb 10 05:19:11 2025: ================
Mon Feb 10 05:19:11 2025:
Mon Feb 10 05:19:11 2025: The log directory structure will be changing in an upcoming release.
Mon Feb 10 05:19:11 2025:
Mon Feb 10 05:19:11 2025: Please use the temporary `--preview_new_logdir` option to try it out with your jobs beforehand.
Mon Feb 10 05:19:11 2025: After the preview period, the new structure will be used for all new jobs (except autoresume follow-ups, which will keep
Mon Feb 10 05:19:11 2025: their original job's log directory structure).
Mon Feb 10 05:19:11 2025:
Mon Feb 10 05:19:11 2025: Please be advised that some file locations may change due to the new structure,
Mon Feb 10 05:19:11 2025: but user code should be unaffected in most cases.
Mon Feb 10 05:19:11 2025:
Mon Feb 10 05:19:11 2025: If you encounter any issues or have feedback, please reach out to `@adlr-support` in Slack.
Mon Feb 10 05:19:11 2025:
Mon Feb 10 05:19:11 2025: [2025-02-10 05:15:24.516 AM PST][WARNING][submit_job]: `--autoresume_method` is deprecated and will be removed in a future release, when all follow-ups use the requeue method.
Mon Feb 10 05:19:11 2025: Please reach out to `@adlr-support` in Slack if you rely on it and need to discuss options.
Mon Feb 10 05:19:11 2025:
Mon Feb 10 05:19:11 2025: [2025-02-10 05:15:27.631 AM PST][INFO][submit_slurm_parent]: Forcing exclusive mode
Mon Feb 10 05:19:11 2025: [2025-02-10 05:15:27.631 AM PST][INFO][submit_job]: Creating the logdir: /lustre/fs12/portfolios/llmservice/users/zhidingy/vlmeval/work_dirs/eval/Eagle-Next/Video-MME_20250210-051527
Mon Feb 10 05:19:11 2025: [2025-02-10 05:15:28.573 AM PST][INFO][submit_slurm_parent]: srun_commands=srun --kill-on-bad-exit=1 --container-image=/home/zhidingy/workspace/eagle2/torch2_test.sqsh --container-mounts=/lustre/fs11/portfolios/adlr/projects/adlr_other_infra/python:/lustre/fs11/portfolios/adlr/projects/adlr_other_infra/python:ro,/lustre/fs11/portfolios/adlr/projects/adlr_other_infra/release/cluster-interface/13.11_2025-02-05_11-20-02:/lustre/fs11/portfolios/adlr/projects/adlr_other_infra/release/cluster-interface/13.11_2025-02-05_11-20-02:ro,/home/adlr/adlr-utils/release/cluster-interface/latest:/home/adlr/adlr-utils/release/cluster-interface/latest:ro,/dev/fuse:/dev/fuse:rw,/home/zhidingy:/home/zhidingy:rw,/lustre/fs12/portfolios/llmservice/users/zhidingy/vlmeval/work_dirs/eval/Eagle-Next/Video-MME_20250210-051527:/lustre/fs12/portfolios/llmservice/users/zhidingy/vlmeval/work_dirs/eval/Eagle-Next/Video-MME_20250210-051527:rw,/lustre/fs12/portfolios/llmservice/users/zhidingy/vlmeval:/lustre/fs12/portfolios/llmservice/users/zhidingy/vlmeval:rw,/home/:/home/:rw,/lustre:/lustre:rw /lustre/fs12/portfolios/llmservice/users/zhidingy/vlmeval/work_dirs/eval/Eagle-Next/Video-MME_20250210-051527/node_command_Video-MME_20250210-051527.sh &
Mon Feb 10 05:19:11 2025: [2025-02-10 05:15:28.577 AM PST][INFO][submit_job]: Details of submit command: sbatch /lustre/fs12/portfolios/llmservice/users/zhidingy/vlmeval/work_dirs/eval/Eagle-Next/Video-MME_20250210-051527/sbatch_Video-MME_20250210-051527.sh
Mon Feb 10 05:19:11 2025: [2025-02-10 05:15:28.582 AM PST][INFO][utils]: Executing command: /lustre/fs12/portfolios/llmservice/users/zhidingy/vlmeval/work_dirs/eval/Eagle-Next/Video-MME_20250210-051527/cluster_submit_command_Video-MME_20250210-051527.sh
Mon Feb 10 05:19:11 2025: [2025-02-10 05:15:29.112 AM PST][INFO][utils]: Stdout:
Mon Feb 10 05:19:11 2025: Submitted batch job 2907015
Mon Feb 10 05:19:11 2025:
Mon Feb 10 05:19:11 2025: [2025-02-10 05:15:29.112 AM PST][INFO][slurm]: Job Id is 2907015
Mon Feb 10 05:19:11 2025: [2025-02-10 05:15:29.476 AM PST][INFO][submit_job]: Non blocking execution - job has been submitted.
Mon Feb 10 05:19:11 2025: CLUSTER=NSS CLUSTER_STACK=NSS CLUSTER_NAME=DRACO_OCI_IAD subdir=nss
Mon Feb 10 05:19:11 2025: [2025-02-10 05:15:33.872 AM PST][INFO][load_config]: Loading NSS config: /home/adlr/adlr-utils/release/cluster-interface/latest/nss/config-draco-oci.json
Mon Feb 10 05:19:11 2025: [2025-02-10 05:15:33.873 AM PST][INFO][load_config]: Overriding from env: NSS_ADLR_PYTHON=NSSSUB_ADLR_UTILS_ENV_ROOT/python/latest/bin/python
Mon Feb 10 05:19:11 2025: [2025-02-10 05:15:33.877 AM PST][WARNING][submit_slurm_parent]:
Mon Feb 10 05:19:11 2025:
Mon Feb 10 05:19:11 2025:
Mon Feb 10 05:19:11 2025: !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
Mon Feb 10 05:19:11 2025: This host seems to be a data copier or container build node, which usually will not be set up to submit jobs.
Mon Feb 10 05:19:11 2025: Did you possibly mean to do it from a login node instead?
Mon Feb 10 05:19:11 2025: !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
Mon Feb 10 05:19:11 2025:
Mon Feb 10 05:19:11 2025: [2025-02-10 05:15:34.342 AM PST][WARNING][submit_job]:
Mon Feb 10 05:19:11 2025: ================
Mon Feb 10 05:19:11 2025: NOTICE
Mon Feb 10 05:19:11 2025: ================
Mon Feb 10 05:19:11 2025:
Mon Feb 10 05:19:11 2025: The log directory structure will be changing in an upcoming release.
Mon Feb 10 05:19:11 2025:
Mon Feb 10 05:19:11 2025: Please use the temporary `--preview_new_logdir` option to try it out with your jobs beforehand.
Mon Feb 10 05:19:11 2025: After the preview period, the new structure will be used for all new jobs (except autoresume follow-ups, which will keep
Mon Feb 10 05:19:11 2025: their original job's log directory structure).
Mon Feb 10 05:19:11 2025:
Mon Feb 10 05:19:11 2025: Please be advised that some file locations may change due to the new structure,
Mon Feb 10 05:19:11 2025: but user code should be unaffected in most cases.
Mon Feb 10 05:19:11 2025:
Mon Feb 10 05:19:11 2025: If you encounter any issues or have feedback, please reach out to `@adlr-support` in Slack.
Mon Feb 10 05:19:11 2025:
Mon Feb 10 05:19:11 2025: [2025-02-10 05:15:34.342 AM PST][WARNING][submit_job]: `--autoresume_method` is deprecated and will be removed in a future release, when all follow-ups use the requeue method.
Mon Feb 10 05:19:11 2025: Please reach out to `@adlr-support` in Slack if you rely on it and need to discuss options.
Mon Feb 10 05:19:11 2025:
Mon Feb 10 05:19:11 2025: [2025-02-10 05:15:40.823 AM PST][INFO][submit_slurm_parent]: Forcing exclusive mode
Mon Feb 10 05:19:11 2025: [2025-02-10 05:15:40.823 AM PST][INFO][submit_job]: Creating the logdir: /lustre/fs12/portfolios/llmservice/users/zhidingy/vlmeval/work_dirs/eval/Eagle-Next/HallusionBench_20250210-051540
Mon Feb 10 05:19:11 2025: [2025-02-10 05:15:42.558 AM PST][INFO][submit_slurm_parent]: srun_commands=srun --kill-on-bad-exit=1 --container-image=/home/zhidingy/workspace/eagle2/torch2_test.sqsh --container-mounts=/lustre/fs11/portfolios/adlr/projects/adlr_other_infra/python:/lustre/fs11/portfolios/adlr/projects/adlr_other_infra/python:ro,/lustre/fs11/portfolios/adlr/projects/adlr_other_infra/release/cluster-interface/13.11_2025-02-05_11-20-02:/lustre/fs11/portfolios/adlr/projects/adlr_other_infra/release/cluster-interface/13.11_2025-02-05_11-20-02:ro,/home/adlr/adlr-utils/release/cluster-interface/latest:/home/adlr/adlr-utils/release/cluster-interface/latest:ro,/dev/fuse:/dev/fuse:rw,/home/zhidingy:/home/zhidingy:rw,/lustre/fs12/portfolios/llmservice/users/zhidingy/vlmeval/work_dirs/eval/Eagle-Next/HallusionBench_20250210-051540:/lustre/fs12/portfolios/llmservice/users/zhidingy/vlmeval/work_dirs/eval/Eagle-Next/HallusionBench_20250210-051540:rw,/lustre/fs12/portfolios/llmservice/users/zhidingy/vlmeval:/lustre/fs12/portfolios/llmservice/users/zhidingy/vlmeval:rw,/home/:/home/:rw,/lustre:/lustre:rw /lustre/fs12/portfolios/llmservice/users/zhidingy/vlmeval/work_dirs/eval/Eagle-Next/HallusionBench_20250210-051540/node_command_HallusionBench_20250210-051540.sh &
Mon Feb 10 05:19:11 2025: [2025-02-10 05:15:42.563 AM PST][INFO][submit_job]: Details of submit command: sbatch /lustre/fs12/portfolios/llmservice/users/zhidingy/vlmeval/work_dirs/eval/Eagle-Next/HallusionBench_20250210-051540/sbatch_HallusionBench_20250210-051540.sh
Mon Feb 10 05:19:11 2025: [2025-02-10 05:15:42.568 AM PST][INFO][utils]: Executing command: /lustre/fs12/portfolios/llmservice/users/zhidingy/vlmeval/work_dirs/eval/Eagle-Next/HallusionBench_20250210-051540/cluster_submit_command_HallusionBench_20250210-051540.sh
Mon Feb 10 05:19:11 2025: [2025-02-10 05:15:44.410 AM PST][INFO][utils]: Stdout:
Mon Feb 10 05:19:11 2025: Submitted batch job 2907025
Mon Feb 10 05:19:11 2025:
Mon Feb 10 05:19:11 2025: [2025-02-10 05:15:44.410 AM PST][INFO][slurm]: Job Id is 2907025
Mon Feb 10 05:19:11 2025: [2025-02-10 05:15:44.772 AM PST][INFO][submit_job]: Non blocking execution - job has been submitted.
Mon Feb 10 05:19:11 2025: CLUSTER=NSS CLUSTER_STACK=NSS CLUSTER_NAME=DRACO_OCI_IAD subdir=nss
Mon Feb 10 05:19:11 2025: [2025-02-10 05:15:48.281 AM PST][INFO][load_config]: Loading NSS config: /home/adlr/adlr-utils/release/cluster-interface/latest/nss/config-draco-oci.json
Mon Feb 10 05:19:11 2025: [2025-02-10 05:15:48.282 AM PST][INFO][load_config]: Overriding from env: NSS_ADLR_PYTHON=NSSSUB_ADLR_UTILS_ENV_ROOT/python/latest/bin/python
Mon Feb 10 05:19:11 2025: [2025-02-10 05:15:48.286 AM PST][WARNING][submit_slurm_parent]:
Mon Feb 10 05:19:11 2025:
Mon Feb 10 05:19:11 2025:
Mon Feb 10 05:19:11 2025: !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
Mon Feb 10 05:19:11 2025: This host seems to be a data copier or container build node, which usually will not be set up to submit jobs.
Mon Feb 10 05:19:11 2025: Did you possibly mean to do it from a login node instead?
Mon Feb 10 05:19:11 2025: !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
Mon Feb 10 05:19:11 2025:
Mon Feb 10 05:19:11 2025: [2025-02-10 05:15:48.674 AM PST][WARNING][submit_job]:
Mon Feb 10 05:19:11 2025: ================
Mon Feb 10 05:19:11 2025: NOTICE
Mon Feb 10 05:19:11 2025: ================
Mon Feb 10 05:19:11 2025:
Mon Feb 10 05:19:11 2025: The log directory structure will be changing in an upcoming release.
Mon Feb 10 05:19:11 2025:
Mon Feb 10 05:19:11 2025: Please use the temporary `--preview_new_logdir` option to try it out with your jobs beforehand.
Mon Feb 10 05:19:11 2025: After the preview period, the new structure will be used for all new jobs (except autoresume follow-ups, which will keep
Mon Feb 10 05:19:11 2025: their original job's log directory structure).
Mon Feb 10 05:19:11 2025:
Mon Feb 10 05:19:11 2025: Please be advised that some file locations may change due to the new structure,
Mon Feb 10 05:19:11 2025: but user code should be unaffected in most cases.
Mon Feb 10 05:19:11 2025:
Mon Feb 10 05:19:11 2025: If you encounter any issues or have feedback, please reach out to `@adlr-support` in Slack.
Mon Feb 10 05:19:11 2025:
Mon Feb 10 05:19:11 2025: [2025-02-10 05:15:48.674 AM PST][WARNING][submit_job]: `--autoresume_method` is deprecated and will be removed in a future release, when all follow-ups use the requeue method.
Mon Feb 10 05:19:11 2025: Please reach out to `@adlr-support` in Slack if you rely on it and need to discuss options.
Mon Feb 10 05:19:11 2025:
Mon Feb 10 05:19:11 2025: [2025-02-10 05:15:51.875 AM PST][INFO][submit_slurm_parent]: Forcing exclusive mode
Mon Feb 10 05:19:11 2025: [2025-02-10 05:15:51.875 AM PST][INFO][submit_job]: Creating the logdir: /lustre/fs12/portfolios/llmservice/users/zhidingy/vlmeval/work_dirs/eval/Eagle-Next/ScienceQA_TEST_20250210-051551
Mon Feb 10 05:19:11 2025: [2025-02-10 05:15:52.870 AM PST][INFO][submit_slurm_parent]: srun_commands=srun --kill-on-bad-exit=1 --container-image=/home/zhidingy/workspace/eagle2/torch2_test.sqsh --container-mounts=/lustre/fs11/portfolios/adlr/projects/adlr_other_infra/python:/lustre/fs11/portfolios/adlr/projects/adlr_other_infra/python:ro,/lustre/fs11/portfolios/adlr/projects/adlr_other_infra/release/cluster-interface/13.11_2025-02-05_11-20-02:/lustre/fs11/portfolios/adlr/projects/adlr_other_infra/release/cluster-interface/13.11_2025-02-05_11-20-02:ro,/home/adlr/adlr-utils/release/cluster-interface/latest:/home/adlr/adlr-utils/release/cluster-interface/latest:ro,/dev/fuse:/dev/fuse:rw,/home/zhidingy:/home/zhidingy:rw,/lustre/fs12/portfolios/llmservice/users/zhidingy/vlmeval/work_dirs/eval/Eagle-Next/ScienceQA_TEST_20250210-051551:/lustre/fs12/portfolios/llmservice/users/zhidingy/vlmeval/work_dirs/eval/Eagle-Next/ScienceQA_TEST_20250210-051551:rw,/lustre/fs12/portfolios/llmservice/users/zhidingy/vlmeval:/lustre/fs12/portfolios/llmservice/users/zhidingy/vlmeval:rw,/home/:/home/:rw,/lustre:/lustre:rw /lustre/fs12/portfolios/llmservice/users/zhidingy/vlmeval/work_dirs/eval/Eagle-Next/ScienceQA_TEST_20250210-051551/node_command_ScienceQA_TEST_20250210-051551.sh &
Mon Feb 10 05:19:11 2025: [2025-02-10 05:15:52.875 AM PST][INFO][submit_job]: Details of submit command: sbatch /lustre/fs12/portfolios/llmservice/users/zhidingy/vlmeval/work_dirs/eval/Eagle-Next/ScienceQA_TEST_20250210-051551/sbatch_ScienceQA_TEST_20250210-051551.sh
Mon Feb 10 05:19:11 2025: [2025-02-10 05:15:52.880 AM PST][INFO][utils]: Executing command: /lustre/fs12/portfolios/llmservice/users/zhidingy/vlmeval/work_dirs/eval/Eagle-Next/ScienceQA_TEST_20250210-051551/cluster_submit_command_ScienceQA_TEST_20250210-051551.sh
Mon Feb 10 05:19:11 2025: [2025-02-10 05:15:53.379 AM PST][INFO][utils]: Stdout:
Mon Feb 10 05:19:11 2025: Submitted batch job 2907032
Mon Feb 10 05:19:11 2025:
Mon Feb 10 05:19:11 2025: [2025-02-10 05:15:53.379 AM PST][INFO][slurm]: Job Id is 2907032
Mon Feb 10 05:19:11 2025: [2025-02-10 05:15:53.744 AM PST][INFO][submit_job]: Non blocking execution - job has been submitted.
Mon Feb 10 05:19:11 2025: CLUSTER=NSS CLUSTER_STACK=NSS CLUSTER_NAME=DRACO_OCI_IAD subdir=nss
Mon Feb 10 05:19:11 2025: [2025-02-10 05:15:57.268 AM PST][INFO][load_config]: Loading NSS config: /home/adlr/adlr-utils/release/cluster-interface/latest/nss/config-draco-oci.json
Mon Feb 10 05:19:11 2025: [2025-02-10 05:15:57.269 AM PST][INFO][load_config]: Overriding from env: NSS_ADLR_PYTHON=NSSSUB_ADLR_UTILS_ENV_ROOT/python/latest/bin/python
Mon Feb 10 05:19:11 2025: [2025-02-10 05:15:57.273 AM PST][WARNING][submit_slurm_parent]:
Mon Feb 10 05:19:11 2025:
Mon Feb 10 05:19:11 2025:
Mon Feb 10 05:19:11 2025: !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
Mon Feb 10 05:19:11 2025: This host seems to be a data copier or container build node, which usually will not be set up to submit jobs.
Mon Feb 10 05:19:11 2025: Did you possibly mean to do it from a login node instead?
Mon Feb 10 05:19:11 2025: !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
Mon Feb 10 05:19:11 2025:
Mon Feb 10 05:19:11 2025: [2025-02-10 05:15:57.706 AM PST][WARNING][submit_job]:
Mon Feb 10 05:19:11 2025: ================
Mon Feb 10 05:19:11 2025: NOTICE
Mon Feb 10 05:19:11 2025: ================
Mon Feb 10 05:19:11 2025:
Mon Feb 10 05:19:11 2025: The log directory structure will be changing in an upcoming release.
Mon Feb 10 05:19:11 2025:
Mon Feb 10 05:19:11 2025: Please use the temporary `--preview_new_logdir` option to try it out with your jobs beforehand.
Mon Feb 10 05:19:11 2025: After the preview period, the new structure will be used for all new jobs (except autoresume follow-ups, which will keep
Mon Feb 10 05:19:11 2025: their original job's log directory structure).
Mon Feb 10 05:19:11 2025:
Mon Feb 10 05:19:11 2025: Please be advised that some file locations may change due to the new structure,
Mon Feb 10 05:19:11 2025: but user code should be unaffected in most cases.
Mon Feb 10 05:19:11 2025:
Mon Feb 10 05:19:11 2025: If you encounter any issues or have feedback, please reach out to `@adlr-support` in Slack.
Mon Feb 10 05:19:11 2025:
Mon Feb 10 05:19:11 2025: [2025-02-10 05:15:57.706 AM PST][WARNING][submit_job]: `--autoresume_method` is deprecated and will be removed in a future release, when all follow-ups use the requeue method.
Mon Feb 10 05:19:11 2025: Please reach out to `@adlr-support` in Slack if you rely on it and need to discuss options.
Mon Feb 10 05:19:11 2025:
Mon Feb 10 05:19:11 2025: [2025-02-10 05:16:00.655 AM PST][INFO][submit_slurm_parent]: Forcing exclusive mode
Mon Feb 10 05:19:11 2025: [2025-02-10 05:16:00.655 AM PST][INFO][submit_job]: Creating the logdir: /lustre/fs12/portfolios/llmservice/users/zhidingy/vlmeval/work_dirs/eval/Eagle-Next/MathVista_MINI_20250210-051600
Mon Feb 10 05:19:11 2025: [2025-02-10 05:16:01.606 AM PST][INFO][submit_slurm_parent]: srun_commands=srun --kill-on-bad-exit=1 --container-image=/home/zhidingy/workspace/eagle2/torch2_test.sqsh --container-mounts=/lustre/fs11/portfolios/adlr/projects/adlr_other_infra/python:/lustre/fs11/portfolios/adlr/projects/adlr_other_infra/python:ro,/lustre/fs11/portfolios/adlr/projects/adlr_other_infra/release/cluster-interface/13.11_2025-02-05_11-20-02:/lustre/fs11/portfolios/adlr/projects/adlr_other_infra/release/cluster-interface/13.11_2025-02-05_11-20-02:ro,/home/adlr/adlr-utils/release/cluster-interface/latest:/home/adlr/adlr-utils/release/cluster-interface/latest:ro,/dev/fuse:/dev/fuse:rw,/home/zhidingy:/home/zhidingy:rw,/lustre/fs12/portfolios/llmservice/users/zhidingy/vlmeval/work_dirs/eval/Eagle-Next/MathVista_MINI_20250210-051600:/lustre/fs12/portfolios/llmservice/users/zhidingy/vlmeval/work_dirs/eval/Eagle-Next/MathVista_MINI_20250210-051600:rw,/lustre/fs12/portfolios/llmservice/users/zhidingy/vlmeval:/lustre/fs12/portfolios/llmservice/users/zhidingy/vlmeval:rw,/home/:/home/:rw,/lustre:/lustre:rw /lustre/fs12/portfolios/llmservice/users/zhidingy/vlmeval/work_dirs/eval/Eagle-Next/MathVista_MINI_20250210-051600/node_command_MathVista_MINI_20250210-051600.sh &
Mon Feb 10 05:19:11 2025: [2025-02-10 05:16:01.611 AM PST][INFO][submit_job]: Details of submit command: sbatch /lustre/fs12/portfolios/llmservice/users/zhidingy/vlmeval/work_dirs/eval/Eagle-Next/MathVista_MINI_20250210-051600/sbatch_MathVista_MINI_20250210-051600.sh
Mon Feb 10 05:19:11 2025: [2025-02-10 05:16:01.616 AM PST][INFO][utils]: Executing command: /lustre/fs12/portfolios/llmservice/users/zhidingy/vlmeval/work_dirs/eval/Eagle-Next/MathVista_MINI_20250210-051600/cluster_submit_command_MathVista_MINI_20250210-051600.sh
Mon Feb 10 05:19:11 2025: [2025-02-10 05:16:02.120 AM PST][INFO][utils]: Stdout:
Mon Feb 10 05:19:11 2025: Submitted batch job 2907036
Mon Feb 10 05:19:11 2025:
Mon Feb 10 05:19:11 2025: [2025-02-10 05:16:02.120 AM PST][INFO][slurm]: Job Id is 2907036
Mon Feb 10 05:19:11 2025: [2025-02-10 05:16:02.481 AM PST][INFO][submit_job]: Non blocking execution - job has been submitted.
Mon Feb 10 05:19:11 2025: CLUSTER=NSS CLUSTER_STACK=NSS CLUSTER_NAME=DRACO_OCI_IAD subdir=nss
Mon Feb 10 05:19:11 2025: [2025-02-10 05:16:05.851 AM PST][INFO][load_config]: Loading NSS config: /home/adlr/adlr-utils/release/cluster-interface/latest/nss/config-draco-oci.json
Mon Feb 10 05:19:11 2025: [2025-02-10 05:16:05.852 AM PST][INFO][load_config]: Overriding from env: NSS_ADLR_PYTHON=NSSSUB_ADLR_UTILS_ENV_ROOT/python/latest/bin/python
Mon Feb 10 05:19:11 2025: [2025-02-10 05:16:05.856 AM PST][WARNING][submit_slurm_parent]:
Mon Feb 10 05:19:11 2025:
Mon Feb 10 05:19:11 2025:
Mon Feb 10 05:19:11 2025: !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
Mon Feb 10 05:19:11 2025: This host seems to be a data copier or container build node, which usually will not be set up to submit jobs.
Mon Feb 10 05:19:11 2025: Did you possibly mean to do it from a login node instead?
Mon Feb 10 05:19:11 2025: !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
Mon Feb 10 05:19:11 2025:
Mon Feb 10 05:19:11 2025: [2025-02-10 05:16:06.244 AM PST][WARNING][submit_job]:
Mon Feb 10 05:19:11 2025: ================
Mon Feb 10 05:19:11 2025: NOTICE
Mon Feb 10 05:19:11 2025: ================
Mon Feb 10 05:19:11 2025:
Mon Feb 10 05:19:11 2025: The log directory structure will be changing in an upcoming release.
Mon Feb 10 05:19:11 2025:
Mon Feb 10 05:19:11 2025: Please use the temporary `--preview_new_logdir` option to try it out with your jobs beforehand.
Mon Feb 10 05:19:11 2025: After the preview period, the new structure will be used for all new jobs (except autoresume follow-ups, which will keep
Mon Feb 10 05:19:11 2025: their original job's log directory structure).
Mon Feb 10 05:19:11 2025:
Mon Feb 10 05:19:11 2025: Please be advised that some file locations may change due to the new structure,
Mon Feb 10 05:19:11 2025: but user code should be unaffected in most cases.
Mon Feb 10 05:19:11 2025:
Mon Feb 10 05:19:11 2025: If you encounter any issues or have feedback, please reach out to `@adlr-support` in Slack.
Mon Feb 10 05:19:11 2025:
Mon Feb 10 05:19:11 2025: [2025-02-10 05:16:06.244 AM PST][WARNING][submit_job]: `--autoresume_method` is deprecated and will be removed in a future release, when all follow-ups use the requeue method.
Mon Feb 10 05:19:11 2025: Please reach out to `@adlr-support` in Slack if you rely on it and need to discuss options.
Mon Feb 10 05:19:11 2025:
Mon Feb 10 05:19:11 2025: [2025-02-10 05:16:09.323 AM PST][INFO][submit_slurm_parent]: Forcing exclusive mode
Mon Feb 10 05:19:11 2025: [2025-02-10 05:16:09.323 AM PST][INFO][submit_job]: Creating the logdir: /lustre/fs12/portfolios/llmservice/users/zhidingy/vlmeval/work_dirs/eval/Eagle-Next/ChartQA_TEST_20250210-051609
Mon Feb 10 05:19:11 2025: [2025-02-10 05:16:10.274 AM PST][INFO][submit_slurm_parent]: srun_commands=srun --kill-on-bad-exit=1 --container-image=/home/zhidingy/workspace/eagle2/torch2_test.sqsh --container-mounts=/lustre/fs11/portfolios/adlr/projects/adlr_other_infra/python:/lustre/fs11/portfolios/adlr/projects/adlr_other_infra/python:ro,/lustre/fs11/portfolios/adlr/projects/adlr_other_infra/release/cluster-interface/13.11_2025-02-05_11-20-02:/lustre/fs11/portfolios/adlr/projects/adlr_other_infra/release/cluster-interface/13.11_2025-02-05_11-20-02:ro,/home/adlr/adlr-utils/release/cluster-interface/latest:/home/adlr/adlr-utils/release/cluster-interface/latest:ro,/dev/fuse:/dev/fuse:rw,/home/zhidingy:/home/zhidingy:rw,/lustre/fs12/portfolios/llmservice/users/zhidingy/vlmeval/work_dirs/eval/Eagle-Next/ChartQA_TEST_20250210-051609:/lustre/fs12/portfolios/llmservice/users/zhidingy/vlmeval/work_dirs/eval/Eagle-Next/ChartQA_TEST_20250210-051609:rw,/lustre/fs12/portfolios/llmservice/users/zhidingy/vlmeval:/lustre/fs12/portfolios/llmservice/users/zhidingy/vlmeval:rw,/home/:/home/:rw,/lustre:/lustre:rw /lustre/fs12/portfolios/llmservice/users/zhidingy/vlmeval/work_dirs/eval/Eagle-Next/ChartQA_TEST_20250210-051609/node_command_ChartQA_TEST_20250210-051609.sh &
Mon Feb 10 05:19:11 2025: [2025-02-10 05:16:10.279 AM PST][INFO][submit_job]: Details of submit command: sbatch /lustre/fs12/portfolios/llmservice/users/zhidingy/vlmeval/work_dirs/eval/Eagle-Next/ChartQA_TEST_20250210-051609/sbatch_ChartQA_TEST_20250210-051609.sh
Mon Feb 10 05:19:11 2025: [2025-02-10 05:16:10.283 AM PST][INFO][utils]: Executing command: /lustre/fs12/portfolios/llmservice/users/zhidingy/vlmeval/work_dirs/eval/Eagle-Next/ChartQA_TEST_20250210-051609/cluster_submit_command_ChartQA_TEST_20250210-051609.sh
Mon Feb 10 05:19:11 2025: [2025-02-10 05:16:10.978 AM PST][INFO][utils]: Stdout:
Mon Feb 10 05:19:11 2025: Submitted batch job 2907044
Mon Feb 10 05:19:11 2025:
Mon Feb 10 05:19:11 2025: [2025-02-10 05:16:10.978 AM PST][INFO][slurm]: Job Id is 2907044
Mon Feb 10 05:19:11 2025: [2025-02-10 05:16:11.343 AM PST][INFO][submit_job]: Non blocking execution - job has been submitted.
Mon Feb 10 05:19:11 2025: CLUSTER=NSS CLUSTER_STACK=NSS CLUSTER_NAME=DRACO_OCI_IAD subdir=nss
Mon Feb 10 05:19:11 2025: [2025-02-10 05:16:14.735 AM PST][INFO][load_config]: Loading NSS config: /home/adlr/adlr-utils/release/cluster-interface/latest/nss/config-draco-oci.json
Mon Feb 10 05:19:11 2025: [2025-02-10 05:16:14.736 AM PST][INFO][load_config]: Overriding from env: NSS_ADLR_PYTHON=NSSSUB_ADLR_UTILS_ENV_ROOT/python/latest/bin/python
Mon Feb 10 05:19:11 2025: [2025-02-10 05:16:14.740 AM PST][WARNING][submit_slurm_parent]:
Mon Feb 10 05:19:11 2025:
Mon Feb 10 05:19:11 2025:
Mon Feb 10 05:19:11 2025: !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
Mon Feb 10 05:19:11 2025: This host seems to be a data copier or container build node, which usually will not be set up to submit jobs.
Mon Feb 10 05:19:11 2025: Did you possibly mean to do it from a login node instead?
Mon Feb 10 05:19:11 2025: !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
Mon Feb 10 05:19:11 2025:
Mon Feb 10 05:19:11 2025: [2025-02-10 05:16:15.125 AM PST][WARNING][submit_job]:
Mon Feb 10 05:19:11 2025: ================
Mon Feb 10 05:19:11 2025: NOTICE
Mon Feb 10 05:19:11 2025: ================
Mon Feb 10 05:19:11 2025:
Mon Feb 10 05:19:11 2025: The log directory structure will be changing in an upcoming release.
Mon Feb 10 05:19:11 2025:
Mon Feb 10 05:19:11 2025: Please use the temporary `--preview_new_logdir` option to try it out with your jobs beforehand.
Mon Feb 10 05:19:11 2025: After the preview period, the new structure will be used for all new jobs (except autoresume follow-ups, which will keep
Mon Feb 10 05:19:11 2025: their original job's log directory structure).
Mon Feb 10 05:19:11 2025:
Mon Feb 10 05:19:11 2025: Please be advised that some file locations may change due to the new structure,
Mon Feb 10 05:19:11 2025: but user code should be unaffected in most cases.
Mon Feb 10 05:19:11 2025:
Mon Feb 10 05:19:11 2025: If you encounter any issues or have feedback, please reach out to `@adlr-support` in Slack.
Mon Feb 10 05:19:11 2025:
Mon Feb 10 05:19:11 2025: [2025-02-10 05:16:15.125 AM PST][WARNING][submit_job]: `--autoresume_method` is deprecated and will be removed in a future release, when all follow-ups use the requeue method.
Mon Feb 10 05:19:11 2025: Please reach out to `@adlr-support` in Slack if you rely on it and need to discuss options.
Mon Feb 10 05:19:11 2025:
Mon Feb 10 05:19:11 2025: [2025-02-10 05:16:18.275 AM PST][INFO][submit_slurm_parent]: Forcing exclusive mode
Mon Feb 10 05:19:11 2025: [2025-02-10 05:16:18.275 AM PST][INFO][submit_job]: Creating the logdir: /lustre/fs12/portfolios/llmservice/users/zhidingy/vlmeval/work_dirs/eval/Eagle-Next/TextVQA_VAL_20250210-051618
Mon Feb 10 05:19:11 2025: [2025-02-10 05:16:19.216 AM PST][INFO][submit_slurm_parent]: srun_commands=srun --kill-on-bad-exit=1 --container-image=/home/zhidingy/workspace/eagle2/torch2_test.sqsh --container-mounts=/lustre/fs11/portfolios/adlr/projects/adlr_other_infra/python:/lustre/fs11/portfolios/adlr/projects/adlr_other_infra/python:ro,/lustre/fs11/portfolios/adlr/projects/adlr_other_infra/release/cluster-interface/13.11_2025-02-05_11-20-02:/lustre/fs11/portfolios/adlr/projects/adlr_other_infra/release/cluster-interface/13.11_2025-02-05_11-20-02:ro,/home/adlr/adlr-utils/release/cluster-interface/latest:/home/adlr/adlr-utils/release/cluster-interface/latest:ro,/dev/fuse:/dev/fuse:rw,/home/zhidingy:/home/zhidingy:rw,/lustre/fs12/portfolios/llmservice/users/zhidingy/vlmeval/work_dirs/eval/Eagle-Next/TextVQA_VAL_20250210-051618:/lustre/fs12/portfolios/llmservice/users/zhidingy/vlmeval/work_dirs/eval/Eagle-Next/TextVQA_VAL_20250210-051618:rw,/lustre/fs12/portfolios/llmservice/users/zhidingy/vlmeval:/lustre/fs12/portfolios/llmservice/users/zhidingy/vlmeval:rw,/home/:/home/:rw,/lustre:/lustre:rw /lustre/fs12/portfolios/llmservice/users/zhidingy/vlmeval/work_dirs/eval/Eagle-Next/TextVQA_VAL_20250210-051618/node_command_TextVQA_VAL_20250210-051618.sh &
Mon Feb 10 05:19:11 2025: [2025-02-10 05:16:19.220 AM PST][INFO][submit_job]: Details of submit command: sbatch /lustre/fs12/portfolios/llmservice/users/zhidingy/vlmeval/work_dirs/eval/Eagle-Next/TextVQA_VAL_20250210-051618/sbatch_TextVQA_VAL_20250210-051618.sh
Mon Feb 10 05:19:11 2025: [2025-02-10 05:16:19.225 AM PST][INFO][utils]: Executing command: /lustre/fs12/portfolios/llmservice/users/zhidingy/vlmeval/work_dirs/eval/Eagle-Next/TextVQA_VAL_20250210-051618/cluster_submit_command_TextVQA_VAL_20250210-051618.sh
Mon Feb 10 05:19:11 2025: [2025-02-10 05:16:20.857 AM PST][INFO][utils]: Stdout:
Mon Feb 10 05:19:11 2025: Submitted batch job 2907051
Mon Feb 10 05:19:11 2025:
Mon Feb 10 05:19:11 2025: [2025-02-10 05:16:20.857 AM PST][INFO][slurm]: Job Id is 2907051
Mon Feb 10 05:19:11 2025: [2025-02-10 05:16:21.221 AM PST][INFO][submit_job]: Non blocking execution - job has been submitted.
Mon Feb 10 05:19:11 2025: CLUSTER=NSS CLUSTER_STACK=NSS CLUSTER_NAME=DRACO_OCI_IAD subdir=nss
Mon Feb 10 05:19:11 2025: [2025-02-10 05:16:24.615 AM PST][INFO][load_config]: Loading NSS config: /home/adlr/adlr-utils/release/cluster-interface/latest/nss/config-draco-oci.json
Mon Feb 10 05:19:11 2025: [2025-02-10 05:16:24.617 AM PST][INFO][load_config]: Overriding from env: NSS_ADLR_PYTHON=NSSSUB_ADLR_UTILS_ENV_ROOT/python/latest/bin/python
Mon Feb 10 05:19:11 2025: [2025-02-10 05:16:24.620 AM PST][WARNING][submit_slurm_parent]:
Mon Feb 10 05:19:11 2025:
Mon Feb 10 05:19:11 2025:
Mon Feb 10 05:19:11 2025: !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
Mon Feb 10 05:19:11 2025: This host seems to be a data copier or container build node, which usually will not be set up to submit jobs.
Mon Feb 10 05:19:11 2025: Did you possibly mean to do it from a login node instead?
Mon Feb 10 05:19:11 2025: !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
Mon Feb 10 05:19:11 2025:
Mon Feb 10 05:19:11 2025: [2025-02-10 05:16:25.000 AM PST][WARNING][submit_job]:
Mon Feb 10 05:19:11 2025: ================
Mon Feb 10 05:19:11 2025: NOTICE
Mon Feb 10 05:19:11 2025: ================
Mon Feb 10 05:19:11 2025:
Mon Feb 10 05:19:11 2025: The log directory structure will be changing in an upcoming release.
Mon Feb 10 05:19:11 2025:
Mon Feb 10 05:19:11 2025: Please use the temporary `--preview_new_logdir` option to try it out with your jobs beforehand.
Mon Feb 10 05:19:11 2025: After the preview period, the new structure will be used for all new jobs (except autoresume follow-ups, which will keep
Mon Feb 10 05:19:11 2025: their original job's log directory structure).
Mon Feb 10 05:19:11 2025:
Mon Feb 10 05:19:11 2025: Please be advised that some file locations may change due to the new structure,
Mon Feb 10 05:19:11 2025: but user code should be unaffected in most cases.
Mon Feb 10 05:19:11 2025:
Mon Feb 10 05:19:11 2025: If you encounter any issues or have feedback, please reach out to `@adlr-support` in Slack.
Mon Feb 10 05:19:11 2025:
Mon Feb 10 05:19:11 2025: [2025-02-10 05:16:25.000 AM PST][WARNING][submit_job]: `--autoresume_method` is deprecated and will be removed in a future release, when all follow-ups use the requeue method.
Mon Feb 10 05:19:11 2025: Please reach out to `@adlr-support` in Slack if you rely on it and need to discuss options.
Mon Feb 10 05:19:11 2025:
Mon Feb 10 05:19:11 2025: [2025-02-10 05:16:28.107 AM PST][INFO][submit_slurm_parent]: Forcing exclusive mode
Mon Feb 10 05:19:11 2025: [2025-02-10 05:16:28.107 AM PST][INFO][submit_job]: Creating the logdir: /lustre/fs12/portfolios/llmservice/users/zhidingy/vlmeval/work_dirs/eval/Eagle-Next/SEEDBench_IMG_20250210-051628
Mon Feb 10 05:19:11 2025: [2025-02-10 05:16:29.074 AM PST][INFO][submit_slurm_parent]: srun_commands=srun --kill-on-bad-exit=1 --container-image=/home/zhidingy/workspace/eagle2/torch2_test.sqsh --container-mounts=/lustre/fs11/portfolios/adlr/projects/adlr_other_infra/python:/lustre/fs11/portfolios/adlr/projects/adlr_other_infra/python:ro,/lustre/fs11/portfolios/adlr/projects/adlr_other_infra/release/cluster-interface/13.11_2025-02-05_11-20-02:/lustre/fs11/portfolios/adlr/projects/adlr_other_infra/release/cluster-interface/13.11_2025-02-05_11-20-02:ro,/home/adlr/adlr-utils/release/cluster-interface/latest:/home/adlr/adlr-utils/release/cluster-interface/latest:ro,/dev/fuse:/dev/fuse:rw,/home/zhidingy:/home/zhidingy:rw,/lustre/fs12/portfolios/llmservice/users/zhidingy/vlmeval/work_dirs/eval/Eagle-Next/SEEDBench_IMG_20250210-051628:/lustre/fs12/portfolios/llmservice/users/zhidingy/vlmeval/work_dirs/eval/Eagle-Next/SEEDBench_IMG_20250210-051628:rw,/lustre/fs12/portfolios/llmservice/users/zhidingy/vlmeval:/lustre/fs12/portfolios/llmservice/users/zhidingy/vlmeval:rw,/home/:/home/:rw,/lustre:/lustre:rw /lustre/fs12/portfolios/llmservice/users/zhidingy/vlmeval/work_dirs/eval/Eagle-Next/SEEDBench_IMG_20250210-051628/node_command_SEEDBench_IMG_20250210-051628.sh &
Mon Feb 10 05:19:11 2025: [2025-02-10 05:16:29.078 AM PST][INFO][submit_job]: Details of submit command: sbatch /lustre/fs12/portfolios/llmservice/users/zhidingy/vlmeval/work_dirs/eval/Eagle-Next/SEEDBench_IMG_20250210-051628/sbatch_SEEDBench_IMG_20250210-051628.sh
Mon Feb 10 05:19:11 2025: [2025-02-10 05:16:29.083 AM PST][INFO][utils]: Executing command: /lustre/fs12/portfolios/llmservice/users/zhidingy/vlmeval/work_dirs/eval/Eagle-Next/SEEDBench_IMG_20250210-051628/cluster_submit_command_SEEDBench_IMG_20250210-051628.sh
Mon Feb 10 05:19:11 2025: [2025-02-10 05:16:29.663 AM PST][INFO][utils]: Stdout:
Mon Feb 10 05:19:11 2025: Submitted batch job 2907057
Mon Feb 10 05:19:11 2025:
Mon Feb 10 05:19:11 2025: [2025-02-10 05:16:29.664 AM PST][INFO][slurm]: Job Id is 2907057
Mon Feb 10 05:19:11 2025: [2025-02-10 05:16:30.664 AM PST][INFO][submit_job]: Non blocking execution - job has been submitted.
Mon Feb 10 05:19:11 2025: CLUSTER=NSS CLUSTER_STACK=NSS CLUSTER_NAME=DRACO_OCI_IAD subdir=nss
Mon Feb 10 05:19:11 2025: [2025-02-10 05:16:34.563 AM PST][INFO][load_config]: Loading NSS config: /home/adlr/adlr-utils/release/cluster-interface/latest/nss/config-draco-oci.json
Mon Feb 10 05:19:11 2025: [2025-02-10 05:16:34.565 AM PST][INFO][load_config]: Overriding from env: NSS_ADLR_PYTHON=NSSSUB_ADLR_UTILS_ENV_ROOT/python/latest/bin/python
Mon Feb 10 05:19:11 2025: [2025-02-10 05:16:34.570 AM PST][WARNING][submit_slurm_parent]:
Mon Feb 10 05:19:11 2025:
Mon Feb 10 05:19:11 2025:
Mon Feb 10 05:19:11 2025: !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
Mon Feb 10 05:19:11 2025: This host seems to be a data copier or container build node, which usually will not be set up to submit jobs.
Mon Feb 10 05:19:11 2025: Did you possibly mean to do it from a login node instead?
Mon Feb 10 05:19:11 2025: !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
Mon Feb 10 05:19:11 2025:
Mon Feb 10 05:19:11 2025: [2025-02-10 05:16:35.201 AM PST][WARNING][submit_job]:
Mon Feb 10 05:19:11 2025: ================
Mon Feb 10 05:19:11 2025: NOTICE
Mon Feb 10 05:19:11 2025: ================
Mon Feb 10 05:19:11 2025:
Mon Feb 10 05:19:11 2025: The log directory structure will be changing in an upcoming release.
Mon Feb 10 05:19:11 2025:
Mon Feb 10 05:19:11 2025: Please use the temporary `--preview_new_logdir` option to try it out with your jobs beforehand.
Mon Feb 10 05:19:11 2025: After the preview period, the new structure will be used for all new jobs (except autoresume follow-ups, which will keep
Mon Feb 10 05:19:11 2025: their original job's log directory structure).
Mon Feb 10 05:19:11 2025:
Mon Feb 10 05:19:11 2025: Please be advised that some file locations may change due to the new structure,
Mon Feb 10 05:19:11 2025: but user code should be unaffected in most cases.
Mon Feb 10 05:19:11 2025:
Mon Feb 10 05:19:11 2025: If you encounter any issues or have feedback, please reach out to `@adlr-support` in Slack.
Mon Feb 10 05:19:11 2025:
Mon Feb 10 05:19:11 2025: [2025-02-10 05:16:35.201 AM PST][WARNING][submit_job]: `--autoresume_method` is deprecated and will be removed in a future release, when all follow-ups use the requeue method.
Mon Feb 10 05:19:11 2025: Please reach out to `@adlr-support` in Slack if you rely on it and need to discuss options.
Mon Feb 10 05:19:11 2025:
Mon Feb 10 05:19:11 2025: [2025-02-10 05:16:38.423 AM PST][INFO][submit_slurm_parent]: Forcing exclusive mode
Mon Feb 10 05:19:11 2025: [2025-02-10 05:16:38.425 AM PST][INFO][submit_job]: Creating the logdir: /lustre/fs12/portfolios/llmservice/users/zhidingy/vlmeval/work_dirs/eval/Eagle-Next/MMBench_DEV_EN_V11_20250210-051638
Mon Feb 10 05:19:11 2025: [2025-02-10 05:16:39.499 AM PST][INFO][submit_slurm_parent]: srun_commands=srun --kill-on-bad-exit=1 --container-image=/home/zhidingy/workspace/eagle2/torch2_test.sqsh --container-mounts=/lustre/fs11/portfolios/adlr/projects/adlr_other_infra/python:/lustre/fs11/portfolios/adlr/projects/adlr_other_infra/python:ro,/lustre/fs11/portfolios/adlr/projects/adlr_other_infra/release/cluster-interface/13.11_2025-02-05_11-20-02:/lustre/fs11/portfolios/adlr/projects/adlr_other_infra/release/cluster-interface/13.11_2025-02-05_11-20-02:ro,/home/adlr/adlr-utils/release/cluster-interface/latest:/home/adlr/adlr-utils/release/cluster-interface/latest:ro,/dev/fuse:/dev/fuse:rw,/home/zhidingy:/home/zhidingy:rw,/lustre/fs12/portfolios/llmservice/users/zhidingy/vlmeval/work_dirs/eval/Eagle-Next/MMBench_DEV_EN_V11_20250210-051638:/lustre/fs12/portfolios/llmservice/users/zhidingy/vlmeval/work_dirs/eval/Eagle-Next/MMBench_DEV_EN_V11_20250210-051638:rw,/lustre/fs12/portfolios/llmservice/users/zhidingy/vlmeval:/lustre/fs12/portfolios/llmservice/users/zhidingy/vlmeval:rw,/home/:/home/:rw,/lustre:/lustre:rw /lustre/fs12/portfolios/llmservice/users/zhidingy/vlmeval/work_dirs/eval/Eagle-Next/MMBench_DEV_EN_V11_20250210-051638/node_command_MMBench_DEV_EN_V11_20250210-051638.sh &
Mon Feb 10 05:19:11 2025: [2025-02-10 05:16:39.503 AM PST][INFO][submit_job]: Details of submit command: sbatch /lustre/fs12/portfolios/llmservice/users/zhidingy/vlmeval/work_dirs/eval/Eagle-Next/MMBench_DEV_EN_V11_20250210-051638/sbatch_MMBench_DEV_EN_V11_20250210-051638.sh
Mon Feb 10 05:19:11 2025: [2025-02-10 05:16:39.508 AM PST][INFO][utils]: Executing command: /lustre/fs12/portfolios/llmservice/users/zhidingy/vlmeval/work_dirs/eval/Eagle-Next/MMBench_DEV_EN_V11_20250210-051638/cluster_submit_command_MMBench_DEV_EN_V11_20250210-051638.sh
Mon Feb 10 05:19:11 2025: [2025-02-10 05:16:40.021 AM PST][INFO][utils]: Stdout:
Mon Feb 10 05:19:11 2025: Submitted batch job 2907068
Mon Feb 10 05:19:11 2025:
Mon Feb 10 05:19:11 2025: [2025-02-10 05:16:40.022 AM PST][INFO][slurm]: Job Id is 2907068
Mon Feb 10 05:19:11 2025: [2025-02-10 05:16:40.385 AM PST][INFO][submit_job]: Non blocking execution - job has been submitted.
Mon Feb 10 05:19:11 2025: CLUSTER=NSS CLUSTER_STACK=NSS CLUSTER_NAME=DRACO_OCI_IAD subdir=nss
Mon Feb 10 05:19:11 2025: [2025-02-10 05:16:43.795 AM PST][INFO][load_config]: Loading NSS config: /home/adlr/adlr-utils/release/cluster-interface/latest/nss/config-draco-oci.json
Mon Feb 10 05:19:11 2025: [2025-02-10 05:16:43.796 AM PST][INFO][load_config]: Overriding from env: NSS_ADLR_PYTHON=NSSSUB_ADLR_UTILS_ENV_ROOT/python/latest/bin/python
Mon Feb 10 05:19:11 2025: [2025-02-10 05:16:43.800 AM PST][WARNING][submit_slurm_parent]:
Mon Feb 10 05:19:11 2025:
Mon Feb 10 05:19:11 2025:
Mon Feb 10 05:19:11 2025: !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
Mon Feb 10 05:19:11 2025: This host seems to be a data copier or container build node, which usually will not be set up to submit jobs.
Mon Feb 10 05:19:11 2025: Did you possibly mean to do it from a login node instead?
Mon Feb 10 05:19:11 2025: !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
Mon Feb 10 05:19:11 2025:
Mon Feb 10 05:19:11 2025: [2025-02-10 05:16:44.193 AM PST][WARNING][submit_job]:
Mon Feb 10 05:19:11 2025: ================
Mon Feb 10 05:19:11 2025: NOTICE
Mon Feb 10 05:19:11 2025: ================
Mon Feb 10 05:19:11 2025:
Mon Feb 10 05:19:11 2025: The log directory structure will be changing in an upcoming release.
Mon Feb 10 05:19:11 2025:
Mon Feb 10 05:19:11 2025: Please use the temporary `--preview_new_logdir` option to try it out with your jobs beforehand.
Mon Feb 10 05:19:11 2025: After the preview period, the new structure will be used for all new jobs (except autoresume follow-ups, which will keep
Mon Feb 10 05:19:11 2025: their original job's log directory structure).
Mon Feb 10 05:19:11 2025:
Mon Feb 10 05:19:11 2025: Please be advised that some file locations may change due to the new structure,
Mon Feb 10 05:19:11 2025: but user code should be unaffected in most cases.
Mon Feb 10 05:19:11 2025:
Mon Feb 10 05:19:11 2025: If you encounter any issues or have feedback, please reach out to `@adlr-support` in Slack.
Mon Feb 10 05:19:11 2025:
Mon Feb 10 05:19:11 2025: [2025-02-10 05:16:44.194 AM PST][WARNING][submit_job]: `--autoresume_method` is deprecated and will be removed in a future release, when all follow-ups use the requeue method.
Mon Feb 10 05:19:11 2025: Please reach out to `@adlr-support` in Slack if you rely on it and need to discuss options.
Mon Feb 10 05:19:11 2025:
Mon Feb 10 05:19:11 2025: [2025-02-10 05:16:47.467 AM PST][INFO][submit_slurm_parent]: Forcing exclusive mode
Mon Feb 10 05:19:11 2025: [2025-02-10 05:16:47.467 AM PST][INFO][submit_job]: Creating the logdir: /lustre/fs12/portfolios/llmservice/users/zhidingy/vlmeval/work_dirs/eval/Eagle-Next/MMBench_DEV_CN_V11_20250210-051647
Mon Feb 10 05:19:11 2025: [2025-02-10 05:16:48.440 AM PST][INFO][submit_slurm_parent]: srun_commands=srun --kill-on-bad-exit=1 --container-image=/home/zhidingy/workspace/eagle2/torch2_test.sqsh --container-mounts=/lustre/fs11/portfolios/adlr/projects/adlr_other_infra/python:/lustre/fs11/portfolios/adlr/projects/adlr_other_infra/python:ro,/lustre/fs11/portfolios/adlr/projects/adlr_other_infra/release/cluster-interface/13.11_2025-02-05_11-20-02:/lustre/fs11/portfolios/adlr/projects/adlr_other_infra/release/cluster-interface/13.11_2025-02-05_11-20-02:ro,/home/adlr/adlr-utils/release/cluster-interface/latest:/home/adlr/adlr-utils/release/cluster-interface/latest:ro,/dev/fuse:/dev/fuse:rw,/home/zhidingy:/home/zhidingy:rw,/lustre/fs12/portfolios/llmservice/users/zhidingy/vlmeval/work_dirs/eval/Eagle-Next/MMBench_DEV_CN_V11_20250210-051647:/lustre/fs12/portfolios/llmservice/users/zhidingy/vlmeval/work_dirs/eval/Eagle-Next/MMBench_DEV_CN_V11_20250210-051647:rw,/lustre/fs12/portfolios/llmservice/users/zhidingy/vlmeval:/lustre/fs12/portfolios/llmservice/users/zhidingy/vlmeval:rw,/home/:/home/:rw,/lustre:/lustre:rw /lustre/fs12/portfolios/llmservice/users/zhidingy/vlmeval/work_dirs/eval/Eagle-Next/MMBench_DEV_CN_V11_20250210-051647/node_command_MMBench_DEV_CN_V11_20250210-051647.sh &
Mon Feb 10 05:19:11 2025: [2025-02-10 05:16:48.445 AM PST][INFO][submit_job]: Details of submit command: sbatch /lustre/fs12/portfolios/llmservice/users/zhidingy/vlmeval/work_dirs/eval/Eagle-Next/MMBench_DEV_CN_V11_20250210-051647/sbatch_MMBench_DEV_CN_V11_20250210-051647.sh
Mon Feb 10 05:19:11 2025: [2025-02-10 05:16:48.450 AM PST][INFO][utils]: Executing command: /lustre/fs12/portfolios/llmservice/users/zhidingy/vlmeval/work_dirs/eval/Eagle-Next/MMBench_DEV_CN_V11_20250210-051647/cluster_submit_command_MMBench_DEV_CN_V11_20250210-051647.sh
Mon Feb 10 05:19:11 2025: [2025-02-10 05:16:48.978 AM PST][INFO][utils]: Stdout:
Mon Feb 10 05:19:11 2025: Submitted batch job 2907073
Mon Feb 10 05:19:11 2025:
Mon Feb 10 05:19:11 2025: [2025-02-10 05:16:48.979 AM PST][INFO][slurm]: Job Id is 2907073
Mon Feb 10 05:19:11 2025: [2025-02-10 05:16:49.343 AM PST][INFO][submit_job]: Non blocking execution - job has been submitted.
Mon Feb 10 05:19:11 2025: CLUSTER=NSS CLUSTER_STACK=NSS CLUSTER_NAME=DRACO_OCI_IAD subdir=nss
Mon Feb 10 05:19:11 2025: [2025-02-10 05:16:52.776 AM PST][INFO][load_config]: Loading NSS config: /home/adlr/adlr-utils/release/cluster-interface/latest/nss/config-draco-oci.json
Mon Feb 10 05:19:11 2025: [2025-02-10 05:16:52.777 AM PST][INFO][load_config]: Overriding from env: NSS_ADLR_PYTHON=NSSSUB_ADLR_UTILS_ENV_ROOT/python/latest/bin/python
Mon Feb 10 05:19:11 2025: [2025-02-10 05:16:52.781 AM PST][WARNING][submit_slurm_parent]:
Mon Feb 10 05:19:11 2025:
Mon Feb 10 05:19:11 2025:
Mon Feb 10 05:19:11 2025: !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
Mon Feb 10 05:19:11 2025: This host seems to be a data copier or container build node, which usually will not be set up to submit jobs.
Mon Feb 10 05:19:11 2025: Did you possibly mean to do it from a login node instead?
Mon Feb 10 05:19:11 2025: !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
Mon Feb 10 05:19:11 2025:
Mon Feb 10 05:19:11 2025: [2025-02-10 05:16:53.300 AM PST][WARNING][submit_job]:
Mon Feb 10 05:19:11 2025: ================
Mon Feb 10 05:19:11 2025: NOTICE
Mon Feb 10 05:19:11 2025: ================
Mon Feb 10 05:19:11 2025:
Mon Feb 10 05:19:11 2025: The log directory structure will be changing in an upcoming release.
Mon Feb 10 05:19:11 2025:
Mon Feb 10 05:19:11 2025: Please use the temporary `--preview_new_logdir` option to try it out with your jobs beforehand.
Mon Feb 10 05:19:11 2025: After the preview period, the new structure will be used for all new jobs (except autoresume follow-ups, which will keep
Mon Feb 10 05:19:11 2025: their original job's log directory structure).
Mon Feb 10 05:19:11 2025:
Mon Feb 10 05:19:11 2025: Please be advised that some file locations may change due to the new structure,
Mon Feb 10 05:19:11 2025: but user code should be unaffected in most cases.
Mon Feb 10 05:19:11 2025:
Mon Feb 10 05:19:11 2025: If you encounter any issues or have feedback, please reach out to `@adlr-support` in Slack.
Mon Feb 10 05:19:11 2025:
Mon Feb 10 05:19:11 2025: [2025-02-10 05:16:53.300 AM PST][WARNING][submit_job]: `--autoresume_method` is deprecated and will be removed in a future release, when all follow-ups use the requeue method.
Mon Feb 10 05:19:11 2025: Please reach out to `@adlr-support` in Slack if you rely on it and need to discuss options.
Mon Feb 10 05:19:11 2025:
Mon Feb 10 05:19:11 2025: [2025-02-10 05:16:56.835 AM PST][INFO][submit_slurm_parent]: Forcing exclusive mode
Mon Feb 10 05:19:11 2025: [2025-02-10 05:16:56.835 AM PST][INFO][submit_job]: Creating the logdir: /lustre/fs12/portfolios/llmservice/users/zhidingy/vlmeval/work_dirs/eval/Eagle-Next/MMMU_DEV_VAL_20250210-051656
Mon Feb 10 05:19:11 2025: [2025-02-10 05:16:57.812 AM PST][INFO][submit_slurm_parent]: srun_commands=srun --kill-on-bad-exit=1 --container-image=/home/zhidingy/workspace/eagle2/torch2_test.sqsh --container-mounts=/lustre/fs11/portfolios/adlr/projects/adlr_other_infra/python:/lustre/fs11/portfolios/adlr/projects/adlr_other_infra/python:ro,/lustre/fs11/portfolios/adlr/projects/adlr_other_infra/release/cluster-interface/13.11_2025-02-05_11-20-02:/lustre/fs11/portfolios/adlr/projects/adlr_other_infra/release/cluster-interface/13.11_2025-02-05_11-20-02:ro,/home/adlr/adlr-utils/release/cluster-interface/latest:/home/adlr/adlr-utils/release/cluster-interface/latest:ro,/dev/fuse:/dev/fuse:rw,/home/zhidingy:/home/zhidingy:rw,/lustre/fs12/portfolios/llmservice/users/zhidingy/vlmeval/work_dirs/eval/Eagle-Next/MMMU_DEV_VAL_20250210-051656:/lustre/fs12/portfolios/llmservice/users/zhidingy/vlmeval/work_dirs/eval/Eagle-Next/MMMU_DEV_VAL_20250210-051656:rw,/lustre/fs12/portfolios/llmservice/users/zhidingy/vlmeval:/lustre/fs12/portfolios/llmservice/users/zhidingy/vlmeval:rw,/home/:/home/:rw,/lustre:/lustre:rw /lustre/fs12/portfolios/llmservice/users/zhidingy/vlmeval/work_dirs/eval/Eagle-Next/MMMU_DEV_VAL_20250210-051656/node_command_MMMU_DEV_VAL_20250210-051656.sh &
Mon Feb 10 05:19:11 2025: [2025-02-10 05:16:57.817 AM PST][INFO][submit_job]: Details of submit command: sbatch /lustre/fs12/portfolios/llmservice/users/zhidingy/vlmeval/work_dirs/eval/Eagle-Next/MMMU_DEV_VAL_20250210-051656/sbatch_MMMU_DEV_VAL_20250210-051656.sh
Mon Feb 10 05:19:11 2025: [2025-02-10 05:16:57.822 AM PST][INFO][utils]: Executing command: /lustre/fs12/portfolios/llmservice/users/zhidingy/vlmeval/work_dirs/eval/Eagle-Next/MMMU_DEV_VAL_20250210-051656/cluster_submit_command_MMMU_DEV_VAL_20250210-051656.sh
Mon Feb 10 05:19:11 2025: [2025-02-10 05:16:59.547 AM PST][INFO][utils]: Stdout:
Mon Feb 10 05:19:11 2025: Submitted batch job 2907084
Mon Feb 10 05:19:11 2025:
Mon Feb 10 05:19:11 2025: [2025-02-10 05:16:59.547 AM PST][INFO][slurm]: Job Id is 2907084
Mon Feb 10 05:19:11 2025: [2025-02-10 05:16:59.907 AM PST][INFO][submit_job]: Non blocking execution - job has been submitted.
Mon Feb 10 05:19:11 2025: CLUSTER=NSS CLUSTER_STACK=NSS CLUSTER_NAME=DRACO_OCI_IAD subdir=nss
Mon Feb 10 05:19:11 2025: [2025-02-10 05:17:03.550 AM PST][INFO][load_config]: Loading NSS config: /home/adlr/adlr-utils/release/cluster-interface/latest/nss/config-draco-oci.json
Mon Feb 10 05:19:11 2025: [2025-02-10 05:17:03.551 AM PST][INFO][load_config]: Overriding from env: NSS_ADLR_PYTHON=NSSSUB_ADLR_UTILS_ENV_ROOT/python/latest/bin/python
Mon Feb 10 05:19:11 2025: [2025-02-10 05:17:03.555 AM PST][WARNING][submit_slurm_parent]:
Mon Feb 10 05:19:11 2025:
Mon Feb 10 05:19:11 2025:
Mon Feb 10 05:19:11 2025: !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
Mon Feb 10 05:19:11 2025: This host seems to be a data copier or container build node, which usually will not be set up to submit jobs.
Mon Feb 10 05:19:11 2025: Did you possibly mean to do it from a login node instead?
Mon Feb 10 05:19:11 2025: !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
Mon Feb 10 05:19:11 2025:
Mon Feb 10 05:19:11 2025: [2025-02-10 05:17:04.018 AM PST][WARNING][submit_job]:
Mon Feb 10 05:19:11 2025: ================
Mon Feb 10 05:19:11 2025: NOTICE
Mon Feb 10 05:19:11 2025: ================
Mon Feb 10 05:19:11 2025:
Mon Feb 10 05:19:11 2025: The log directory structure will be changing in an upcoming release.
Mon Feb 10 05:19:11 2025:
Mon Feb 10 05:19:11 2025: Please use the temporary `--preview_new_logdir` option to try it out with your jobs beforehand.
Mon Feb 10 05:19:11 2025: After the preview period, the new structure will be used for all new jobs (except autoresume follow-ups, which will keep
Mon Feb 10 05:19:11 2025: their original job's log directory structure).
Mon Feb 10 05:19:11 2025:
Mon Feb 10 05:19:11 2025: Please be advised that some file locations may change due to the new structure,
Mon Feb 10 05:19:11 2025: but user code should be unaffected in most cases.
Mon Feb 10 05:19:11 2025:
Mon Feb 10 05:19:11 2025: If you encounter any issues or have feedback, please reach out to `@adlr-support` in Slack.
Mon Feb 10 05:19:11 2025:
Mon Feb 10 05:19:11 2025: [2025-02-10 05:17:04.018 AM PST][WARNING][submit_job]: `--autoresume_method` is deprecated and will be removed in a future release, when all follow-ups use the requeue method.
Mon Feb 10 05:19:11 2025: Please reach out to `@adlr-support` in Slack if you rely on it and need to discuss options.
Mon Feb 10 05:19:11 2025:
Mon Feb 10 05:19:11 2025: [2025-02-10 05:17:06.979 AM PST][INFO][submit_slurm_parent]: Forcing exclusive mode
Mon Feb 10 05:19:11 2025: [2025-02-10 05:17:06.979 AM PST][INFO][submit_job]: Creating the logdir: /lustre/fs12/portfolios/llmservice/users/zhidingy/vlmeval/work_dirs/eval/Eagle-Next/AI2D_TEST_20250210-051706
Mon Feb 10 05:19:11 2025: [2025-02-10 05:17:07.931 AM PST][INFO][submit_slurm_parent]: srun_commands=srun --kill-on-bad-exit=1 --container-image=/home/zhidingy/workspace/eagle2/torch2_test.sqsh --container-mounts=/lustre/fs11/portfolios/adlr/projects/adlr_other_infra/python:/lustre/fs11/portfolios/adlr/projects/adlr_other_infra/python:ro,/lustre/fs11/portfolios/adlr/projects/adlr_other_infra/release/cluster-interface/13.11_2025-02-05_11-20-02:/lustre/fs11/portfolios/adlr/projects/adlr_other_infra/release/cluster-interface/13.11_2025-02-05_11-20-02:ro,/home/adlr/adlr-utils/release/cluster-interface/latest:/home/adlr/adlr-utils/release/cluster-interface/latest:ro,/dev/fuse:/dev/fuse:rw,/home/zhidingy:/home/zhidingy:rw,/lustre/fs12/portfolios/llmservice/users/zhidingy/vlmeval/work_dirs/eval/Eagle-Next/AI2D_TEST_20250210-051706:/lustre/fs12/portfolios/llmservice/users/zhidingy/vlmeval/work_dirs/eval/Eagle-Next/AI2D_TEST_20250210-051706:rw,/lustre/fs12/portfolios/llmservice/users/zhidingy/vlmeval:/lustre/fs12/portfolios/llmservice/users/zhidingy/vlmeval:rw,/home/:/home/:rw,/lustre:/lustre:rw /lustre/fs12/portfolios/llmservice/users/zhidingy/vlmeval/work_dirs/eval/Eagle-Next/AI2D_TEST_20250210-051706/node_command_AI2D_TEST_20250210-051706.sh &
Mon Feb 10 05:19:11 2025: [2025-02-10 05:17:07.936 AM PST][INFO][submit_job]: Details of submit command: sbatch /lustre/fs12/portfolios/llmservice/users/zhidingy/vlmeval/work_dirs/eval/Eagle-Next/AI2D_TEST_20250210-051706/sbatch_AI2D_TEST_20250210-051706.sh
Mon Feb 10 05:19:11 2025: [2025-02-10 05:17:07.940 AM PST][INFO][utils]: Executing command: /lustre/fs12/portfolios/llmservice/users/zhidingy/vlmeval/work_dirs/eval/Eagle-Next/AI2D_TEST_20250210-051706/cluster_submit_command_AI2D_TEST_20250210-051706.sh
Mon Feb 10 05:19:11 2025: [2025-02-10 05:17:08.424 AM PST][INFO][utils]: Stdout:
Mon Feb 10 05:19:11 2025: Submitted batch job 2907089
Mon Feb 10 05:19:11 2025:
Mon Feb 10 05:19:11 2025: [2025-02-10 05:17:08.424 AM PST][INFO][slurm]: Job Id is 2907089
Mon Feb 10 05:19:11 2025: [2025-02-10 05:17:08.797 AM PST][INFO][submit_job]: Non blocking execution - job has been submitted.
Mon Feb 10 05:19:11 2025: CLUSTER=NSS CLUSTER_STACK=NSS CLUSTER_NAME=DRACO_OCI_IAD subdir=nss
Mon Feb 10 05:19:11 2025: [2025-02-10 05:17:12.274 AM PST][INFO][load_config]: Loading NSS config: /home/adlr/adlr-utils/release/cluster-interface/latest/nss/config-draco-oci.json
Mon Feb 10 05:19:11 2025: [2025-02-10 05:17:12.275 AM PST][INFO][load_config]: Overriding from env: NSS_ADLR_PYTHON=NSSSUB_ADLR_UTILS_ENV_ROOT/python/latest/bin/python
Mon Feb 10 05:19:11 2025: [2025-02-10 05:17:12.278 AM PST][WARNING][submit_slurm_parent]:
Mon Feb 10 05:19:11 2025:
Mon Feb 10 05:19:11 2025:
Mon Feb 10 05:19:11 2025: !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
Mon Feb 10 05:19:11 2025: This host seems to be a data copier or container build node, which usually will not be set up to submit jobs.
Mon Feb 10 05:19:11 2025: Did you possibly mean to do it from a login node instead?
Mon Feb 10 05:19:11 2025: !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
Mon Feb 10 05:19:11 2025:
Mon Feb 10 05:19:11 2025: [2025-02-10 05:17:12.660 AM PST][WARNING][submit_job]:
Mon Feb 10 05:19:11 2025: ================
Mon Feb 10 05:19:11 2025: NOTICE
Mon Feb 10 05:19:11 2025: ================
Mon Feb 10 05:19:11 2025:
Mon Feb 10 05:19:11 2025: The log directory structure will be changing in an upcoming release.
Mon Feb 10 05:19:11 2025:
Mon Feb 10 05:19:11 2025: Please use the temporary `--preview_new_logdir` option to try it out with your jobs beforehand.
Mon Feb 10 05:19:11 2025: After the preview period, the new structure will be used for all new jobs (except autoresume follow-ups, which will keep
Mon Feb 10 05:19:11 2025: their original job's log directory structure).
Mon Feb 10 05:19:11 2025:
Mon Feb 10 05:19:11 2025: Please be advised that some file locations may change due to the new structure,
Mon Feb 10 05:19:12 2025: but user code should be unaffected in most cases.
Mon Feb 10 05:19:12 2025:
Mon Feb 10 05:19:12 2025: If you encounter any issues or have feedback, please reach out to `@adlr-support` in Slack.
Mon Feb 10 05:19:12 2025:
Mon Feb 10 05:19:12 2025: [2025-02-10 05:17:12.660 AM PST][WARNING][submit_job]: `--autoresume_method` is deprecated and will be removed in a future release, when all follow-ups use the requeue method.
Mon Feb 10 05:19:12 2025: Please reach out to `@adlr-support` in Slack if you rely on it and need to discuss options.
Mon Feb 10 05:19:12 2025:
Mon Feb 10 05:19:12 2025: [2025-02-10 05:17:15.567 AM PST][INFO][submit_slurm_parent]: Forcing exclusive mode
Mon Feb 10 05:19:12 2025: [2025-02-10 05:17:15.567 AM PST][INFO][submit_job]: Creating the logdir: /lustre/fs12/portfolios/llmservice/users/zhidingy/vlmeval/work_dirs/eval/Eagle-Next/InfoVQA_VAL_20250210-051715
Mon Feb 10 05:19:12 2025: [2025-02-10 05:17:16.576 AM PST][INFO][submit_slurm_parent]: srun_commands=srun --kill-on-bad-exit=1 --container-image=/home/zhidingy/workspace/eagle2/torch2_test.sqsh --container-mounts=/lustre/fs11/portfolios/adlr/projects/adlr_other_infra/python:/lustre/fs11/portfolios/adlr/projects/adlr_other_infra/python:ro,/lustre/fs11/portfolios/adlr/projects/adlr_other_infra/release/cluster-interface/13.11_2025-02-05_11-20-02:/lustre/fs11/portfolios/adlr/projects/adlr_other_infra/release/cluster-interface/13.11_2025-02-05_11-20-02:ro,/home/adlr/adlr-utils/release/cluster-interface/latest:/home/adlr/adlr-utils/release/cluster-interface/latest:ro,/dev/fuse:/dev/fuse:rw,/home/zhidingy:/home/zhidingy:rw,/lustre/fs12/portfolios/llmservice/users/zhidingy/vlmeval/work_dirs/eval/Eagle-Next/InfoVQA_VAL_20250210-051715:/lustre/fs12/portfolios/llmservice/users/zhidingy/vlmeval/work_dirs/eval/Eagle-Next/InfoVQA_VAL_20250210-051715:rw,/lustre/fs12/portfolios/llmservice/users/zhidingy/vlmeval:/lustre/fs12/portfolios/llmservice/users/zhidingy/vlmeval:rw,/home/:/home/:rw,/lustre:/lustre:rw /lustre/fs12/portfolios/llmservice/users/zhidingy/vlmeval/work_dirs/eval/Eagle-Next/InfoVQA_VAL_20250210-051715/node_command_InfoVQA_VAL_20250210-051715.sh &
Mon Feb 10 05:19:12 2025: [2025-02-10 05:17:16.581 AM PST][INFO][submit_job]: Details of submit command: sbatch /lustre/fs12/portfolios/llmservice/users/zhidingy/vlmeval/work_dirs/eval/Eagle-Next/InfoVQA_VAL_20250210-051715/sbatch_InfoVQA_VAL_20250210-051715.sh
Mon Feb 10 05:19:12 2025: [2025-02-10 05:17:16.586 AM PST][INFO][utils]: Executing command: /lustre/fs12/portfolios/llmservice/users/zhidingy/vlmeval/work_dirs/eval/Eagle-Next/InfoVQA_VAL_20250210-051715/cluster_submit_command_InfoVQA_VAL_20250210-051715.sh
Mon Feb 10 05:19:12 2025: [2025-02-10 05:17:17.352 AM PST][INFO][utils]: Stdout:
Mon Feb 10 05:19:12 2025: Submitted batch job 2907096
Mon Feb 10 05:19:12 2025:
Mon Feb 10 05:19:12 2025: [2025-02-10 05:17:17.352 AM PST][INFO][slurm]: Job Id is 2907096
Mon Feb 10 05:19:12 2025: [2025-02-10 05:17:17.742 AM PST][INFO][submit_job]: Non blocking execution - job has been submitted.
Mon Feb 10 05:19:12 2025: CLUSTER=NSS CLUSTER_STACK=NSS CLUSTER_NAME=DRACO_OCI_IAD subdir=nss
Mon Feb 10 05:19:12 2025: [2025-02-10 05:17:21.110 AM PST][INFO][load_config]: Loading NSS config: /home/adlr/adlr-utils/release/cluster-interface/latest/nss/config-draco-oci.json
Mon Feb 10 05:19:12 2025: [2025-02-10 05:17:21.111 AM PST][INFO][load_config]: Overriding from env: NSS_ADLR_PYTHON=NSSSUB_ADLR_UTILS_ENV_ROOT/python/latest/bin/python
Mon Feb 10 05:19:12 2025: [2025-02-10 05:17:21.115 AM PST][WARNING][submit_slurm_parent]:
Mon Feb 10 05:19:12 2025:
Mon Feb 10 05:19:12 2025:
Mon Feb 10 05:19:12 2025: !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
Mon Feb 10 05:19:12 2025: This host seems to be a data copier or container build node, which usually will not be set up to submit jobs.
Mon Feb 10 05:19:12 2025: Did you possibly mean to do it from a login node instead?
Mon Feb 10 05:19:12 2025: !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
Mon Feb 10 05:19:12 2025:
Mon Feb 10 05:19:12 2025: [2025-02-10 05:17:21.496 AM PST][WARNING][submit_job]:
Mon Feb 10 05:19:12 2025: ================
Mon Feb 10 05:19:12 2025: NOTICE
Mon Feb 10 05:19:12 2025: ================
Mon Feb 10 05:19:12 2025:
Mon Feb 10 05:19:12 2025: The log directory structure will be changing in an upcoming release.
Mon Feb 10 05:19:12 2025:
Mon Feb 10 05:19:12 2025: Please use the temporary `--preview_new_logdir` option to try it out with your jobs beforehand.
Mon Feb 10 05:19:12 2025: After the preview period, the new structure will be used for all new jobs (except autoresume follow-ups, which will keep
Mon Feb 10 05:19:12 2025: their original job's log directory structure).
Mon Feb 10 05:19:12 2025:
Mon Feb 10 05:19:12 2025: Please be advised that some file locations may change due to the new structure,
Mon Feb 10 05:19:12 2025: but user code should be unaffected in most cases.
Mon Feb 10 05:19:12 2025:
Mon Feb 10 05:19:12 2025: If you encounter any issues or have feedback, please reach out to `@adlr-support` in Slack.
Mon Feb 10 05:19:12 2025:
Mon Feb 10 05:19:12 2025: [2025-02-10 05:17:21.496 AM PST][WARNING][submit_job]: `--autoresume_method` is deprecated and will be removed in a future release, when all follow-ups use the requeue method.
Mon Feb 10 05:19:12 2025: Please reach out to `@adlr-support` in Slack if you rely on it and need to discuss options.
Mon Feb 10 05:19:12 2025:
Mon Feb 10 05:19:12 2025: [2025-02-10 05:17:24.423 AM PST][INFO][submit_slurm_parent]: Forcing exclusive mode
Mon Feb 10 05:19:12 2025: [2025-02-10 05:17:24.423 AM PST][INFO][submit_job]: Creating the logdir: /lustre/fs12/portfolios/llmservice/users/zhidingy/vlmeval/work_dirs/eval/Eagle-Next/OCRBench_20250210-051724
Mon Feb 10 05:19:12 2025: [2025-02-10 05:17:25.367 AM PST][INFO][submit_slurm_parent]: srun_commands=srun --kill-on-bad-exit=1 --container-image=/home/zhidingy/workspace/eagle2/torch2_test.sqsh --container-mounts=/lustre/fs11/portfolios/adlr/projects/adlr_other_infra/python:/lustre/fs11/portfolios/adlr/projects/adlr_other_infra/python:ro,/lustre/fs11/portfolios/adlr/projects/adlr_other_infra/release/cluster-interface/13.11_2025-02-05_11-20-02:/lustre/fs11/portfolios/adlr/projects/adlr_other_infra/release/cluster-interface/13.11_2025-02-05_11-20-02:ro,/home/adlr/adlr-utils/release/cluster-interface/latest:/home/adlr/adlr-utils/release/cluster-interface/latest:ro,/dev/fuse:/dev/fuse:rw,/home/zhidingy:/home/zhidingy:rw,/lustre/fs12/portfolios/llmservice/users/zhidingy/vlmeval/work_dirs/eval/Eagle-Next/OCRBench_20250210-051724:/lustre/fs12/portfolios/llmservice/users/zhidingy/vlmeval/work_dirs/eval/Eagle-Next/OCRBench_20250210-051724:rw,/lustre/fs12/portfolios/llmservice/users/zhidingy/vlmeval:/lustre/fs12/portfolios/llmservice/users/zhidingy/vlmeval:rw,/home/:/home/:rw,/lustre:/lustre:rw /lustre/fs12/portfolios/llmservice/users/zhidingy/vlmeval/work_dirs/eval/Eagle-Next/OCRBench_20250210-051724/node_command_OCRBench_20250210-051724.sh &
Mon Feb 10 05:19:12 2025: [2025-02-10 05:17:25.372 AM PST][INFO][submit_job]: Details of submit command: sbatch /lustre/fs12/portfolios/llmservice/users/zhidingy/vlmeval/work_dirs/eval/Eagle-Next/OCRBench_20250210-051724/sbatch_OCRBench_20250210-051724.sh
Mon Feb 10 05:19:12 2025: [2025-02-10 05:17:25.376 AM PST][INFO][utils]: Executing command: /lustre/fs12/portfolios/llmservice/users/zhidingy/vlmeval/work_dirs/eval/Eagle-Next/OCRBench_20250210-051724/cluster_submit_command_OCRBench_20250210-051724.sh
Mon Feb 10 05:19:12 2025: [2025-02-10 05:17:26.053 AM PST][INFO][utils]: Stdout:
Mon Feb 10 05:19:12 2025: Submitted batch job 2907101
Mon Feb 10 05:19:12 2025:
Mon Feb 10 05:19:12 2025: [2025-02-10 05:17:26.054 AM PST][INFO][slurm]: Job Id is 2907101
Mon Feb 10 05:19:12 2025: [2025-02-10 05:17:26.438 AM PST][INFO][submit_job]: Non blocking execution - job has been submitted.
Mon Feb 10 05:19:12 2025: CLUSTER=NSS CLUSTER_STACK=NSS CLUSTER_NAME=DRACO_OCI_IAD subdir=nss
Mon Feb 10 05:19:12 2025: [2025-02-10 05:17:31.022 AM PST][INFO][load_config]: Loading NSS config: /home/adlr/adlr-utils/release/cluster-interface/latest/nss/config-draco-oci.json
Mon Feb 10 05:19:12 2025: [2025-02-10 05:17:31.023 AM PST][INFO][load_config]: Overriding from env: NSS_ADLR_PYTHON=NSSSUB_ADLR_UTILS_ENV_ROOT/python/latest/bin/python
Mon Feb 10 05:19:12 2025: [2025-02-10 05:17:31.028 AM PST][WARNING][submit_slurm_parent]:
Mon Feb 10 05:19:12 2025:
Mon Feb 10 05:19:12 2025:
Mon Feb 10 05:19:12 2025: !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
Mon Feb 10 05:19:12 2025: This host seems to be a data copier or container build node, which usually will not be set up to submit jobs.
Mon Feb 10 05:19:12 2025: Did you possibly mean to do it from a login node instead?
Mon Feb 10 05:19:12 2025: !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
Mon Feb 10 05:19:12 2025:
Mon Feb 10 05:19:12 2025: [2025-02-10 05:17:31.413 AM PST][WARNING][submit_job]:
Mon Feb 10 05:19:12 2025: ================
Mon Feb 10 05:19:12 2025: NOTICE
Mon Feb 10 05:19:12 2025: ================
Mon Feb 10 05:19:12 2025:
Mon Feb 10 05:19:12 2025: The log directory structure will be changing in an upcoming release.
Mon Feb 10 05:19:12 2025:
Mon Feb 10 05:19:12 2025: Please use the temporary `--preview_new_logdir` option to try it out with your jobs beforehand.
Mon Feb 10 05:19:12 2025: After the preview period, the new structure will be used for all new jobs (except autoresume follow-ups, which will keep
Mon Feb 10 05:19:12 2025: their original job's log directory structure).
Mon Feb 10 05:19:12 2025:
Mon Feb 10 05:19:12 2025: Please be advised that some file locations may change due to the new structure,
Mon Feb 10 05:19:12 2025: but user code should be unaffected in most cases.
Mon Feb 10 05:19:12 2025:
Mon Feb 10 05:19:12 2025: If you encounter any issues or have feedback, please reach out to `@adlr-support` in Slack.
Mon Feb 10 05:19:12 2025:
Mon Feb 10 05:19:12 2025: [2025-02-10 05:17:31.413 AM PST][WARNING][submit_job]: `--autoresume_method` is deprecated and will be removed in a future release, when all follow-ups use the requeue method.
Mon Feb 10 05:19:12 2025: Please reach out to `@adlr-support` in Slack if you rely on it and need to discuss options.
Mon Feb 10 05:19:12 2025:
Mon Feb 10 05:19:12 2025: [2025-02-10 05:17:34.431 AM PST][INFO][submit_slurm_parent]: Forcing exclusive mode
Mon Feb 10 05:19:12 2025: [2025-02-10 05:17:34.433 AM PST][INFO][submit_job]: Creating the logdir: /lustre/fs12/portfolios/llmservice/users/zhidingy/vlmeval/work_dirs/eval/Eagle-Next/RealWorldQA_20250210-051734
Mon Feb 10 05:19:12 2025: [2025-02-10 05:17:35.504 AM PST][INFO][submit_slurm_parent]: srun_commands=srun --kill-on-bad-exit=1 --container-image=/home/zhidingy/workspace/eagle2/torch2_test.sqsh --container-mounts=/lustre/fs11/portfolios/adlr/projects/adlr_other_infra/python:/lustre/fs11/portfolios/adlr/projects/adlr_other_infra/python:ro,/lustre/fs11/portfolios/adlr/projects/adlr_other_infra/release/cluster-interface/13.11_2025-02-05_11-20-02:/lustre/fs11/portfolios/adlr/projects/adlr_other_infra/release/cluster-interface/13.11_2025-02-05_11-20-02:ro,/home/adlr/adlr-utils/release/cluster-interface/latest:/home/adlr/adlr-utils/release/cluster-interface/latest:ro,/dev/fuse:/dev/fuse:rw,/home/zhidingy:/home/zhidingy:rw,/lustre/fs12/portfolios/llmservice/users/zhidingy/vlmeval/work_dirs/eval/Eagle-Next/RealWorldQA_20250210-051734:/lustre/fs12/portfolios/llmservice/users/zhidingy/vlmeval/work_dirs/eval/Eagle-Next/RealWorldQA_20250210-051734:rw,/lustre/fs12/portfolios/llmservice/users/zhidingy/vlmeval:/lustre/fs12/portfolios/llmservice/users/zhidingy/vlmeval:rw,/home/:/home/:rw,/lustre:/lustre:rw /lustre/fs12/portfolios/llmservice/users/zhidingy/vlmeval/work_dirs/eval/Eagle-Next/RealWorldQA_20250210-051734/node_command_RealWorldQA_20250210-051734.sh &
Mon Feb 10 05:19:12 2025: [2025-02-10 05:17:35.508 AM PST][INFO][submit_job]: Details of submit command: sbatch /lustre/fs12/portfolios/llmservice/users/zhidingy/vlmeval/work_dirs/eval/Eagle-Next/RealWorldQA_20250210-051734/sbatch_RealWorldQA_20250210-051734.sh
Mon Feb 10 05:19:12 2025: [2025-02-10 05:17:35.513 AM PST][INFO][utils]: Executing command: /lustre/fs12/portfolios/llmservice/users/zhidingy/vlmeval/work_dirs/eval/Eagle-Next/RealWorldQA_20250210-051734/cluster_submit_command_RealWorldQA_20250210-051734.sh
Mon Feb 10 05:19:12 2025: [2025-02-10 05:17:36.243 AM PST][INFO][utils]: Stdout:
Mon Feb 10 05:19:12 2025: Submitted batch job 2907110
Mon Feb 10 05:19:12 2025:
Mon Feb 10 05:19:12 2025: [2025-02-10 05:17:36.244 AM PST][INFO][slurm]: Job Id is 2907110
Mon Feb 10 05:19:12 2025: [2025-02-10 05:17:36.610 AM PST][INFO][submit_job]: Non blocking execution - job has been submitted.
Mon Feb 10 05:19:12 2025: CLUSTER=NSS CLUSTER_STACK=NSS CLUSTER_NAME=DRACO_OCI_IAD subdir=nss
Mon Feb 10 05:19:12 2025: [2025-02-10 05:17:40.410 AM PST][INFO][load_config]: Loading NSS config: /home/adlr/adlr-utils/release/cluster-interface/latest/nss/config-draco-oci.json
Mon Feb 10 05:19:12 2025: [2025-02-10 05:17:40.411 AM PST][INFO][load_config]: Overriding from env: NSS_ADLR_PYTHON=NSSSUB_ADLR_UTILS_ENV_ROOT/python/latest/bin/python
Mon Feb 10 05:19:12 2025: [2025-02-10 05:17:40.415 AM PST][WARNING][submit_slurm_parent]:
Mon Feb 10 05:19:12 2025:
Mon Feb 10 05:19:12 2025:
Mon Feb 10 05:19:12 2025: !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
Mon Feb 10 05:19:12 2025: This host seems to be a data copier or container build node, which usually will not be set up to submit jobs.
Mon Feb 10 05:19:12 2025: Did you possibly mean to do it from a login node instead?
Mon Feb 10 05:19:12 2025: !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
Mon Feb 10 05:19:12 2025:
Mon Feb 10 05:19:12 2025: [2025-02-10 05:17:40.801 AM PST][WARNING][submit_job]: `--autoresume_method` is deprecated and will be removed in a future release, when all follow-ups use the requeue method.
Mon Feb 10 05:19:12 2025: Please reach out to `@adlr-support` in Slack if you rely on it and need to discuss options.
Mon Feb 10 05:19:12 2025:
Mon Feb 10 05:19:12 2025: [2025-02-10 05:17:44.015 AM PST][INFO][submit_slurm_parent]: Forcing exclusive mode
Mon Feb 10 05:19:12 2025: [2025-02-10 05:17:44.015 AM PST][INFO][submit_job]: Creating the logdir: /lustre/fs12/portfolios/llmservice/users/zhidingy/vlmeval/work_dirs/eval/Eagle-Next/POPE_20250210-051744
Mon Feb 10 05:19:12 2025: [2025-02-10 05:17:45.354 AM PST][INFO][submit_slurm_parent]: srun_commands=srun --kill-on-bad-exit=1 --container-image=/home/zhidingy/workspace/eagle2/torch2_test.sqsh --container-mounts=/lustre/fs11/portfolios/adlr/projects/adlr_other_infra/python:/lustre/fs11/portfolios/adlr/projects/adlr_other_infra/python:ro,/lustre/fs11/portfolios/adlr/projects/adlr_other_infra/release/cluster-interface/13.11_2025-02-05_11-20-02:/lustre/fs11/portfolios/adlr/projects/adlr_other_infra/release/cluster-interface/13.11_2025-02-05_11-20-02:ro,/home/adlr/adlr-utils/release/cluster-interface/latest:/home/adlr/adlr-utils/release/cluster-interface/latest:ro,/dev/fuse:/dev/fuse:rw,/home/zhidingy:/home/zhidingy:rw,/lustre/fs12/portfolios/llmservice/users/zhidingy/vlmeval/work_dirs/eval/Eagle-Next/POPE_20250210-051744:/lustre/fs12/portfolios/llmservice/users/zhidingy/vlmeval/work_dirs/eval/Eagle-Next/POPE_20250210-051744:rw,/lustre/fs12/portfolios/llmservice/users/zhidingy/vlmeval:/lustre/fs12/portfolios/llmservice/users/zhidingy/vlmeval:rw,/home/:/home/:rw,/lustre:/lustre:rw /lustre/fs12/portfolios/llmservice/users/zhidingy/vlmeval/work_dirs/eval/Eagle-Next/POPE_20250210-051744/node_command_POPE_20250210-051744.sh &
Mon Feb 10 05:19:12 2025: [2025-02-10 05:17:45.358 AM PST][INFO][submit_job]: Details of submit command: sbatch /lustre/fs12/portfolios/llmservice/users/zhidingy/vlmeval/work_dirs/eval/Eagle-Next/POPE_20250210-051744/sbatch_POPE_20250210-051744.sh
Mon Feb 10 05:19:12 2025: [2025-02-10 05:17:45.363 AM PST][INFO][utils]: Executing command: /lustre/fs12/portfolios/llmservice/users/zhidingy/vlmeval/work_dirs/eval/Eagle-Next/POPE_20250210-051744/cluster_submit_command_POPE_20250210-051744.sh
Mon Feb 10 05:19:12 2025: [2025-02-10 05:17:45.883 AM PST][INFO][utils]: Stdout:
Mon Feb 10 05:19:12 2025: Submitted batch job 2907121
Mon Feb 10 05:19:12 2025:
Mon Feb 10 05:19:12 2025: [2025-02-10 05:17:45.883 AM PST][INFO][slurm]: Job Id is 2907121
Mon Feb 10 05:19:12 2025: [2025-02-10 05:17:46.250 AM PST][INFO][submit_job]: Non blocking execution - job has been submitted.
Mon Feb 10 05:19:12 2025: CLUSTER=NSS CLUSTER_STACK=NSS CLUSTER_NAME=DRACO_OCI_IAD subdir=nss
Mon Feb 10 05:19:12 2025: [2025-02-10 05:17:52.179 AM PST][INFO][load_config]: Loading NSS config: /home/adlr/adlr-utils/release/cluster-interface/latest/nss/config-draco-oci.json
Mon Feb 10 05:19:12 2025: [2025-02-10 05:17:52.180 AM PST][INFO][load_config]: Overriding from env: NSS_ADLR_PYTHON=NSSSUB_ADLR_UTILS_ENV_ROOT/python/latest/bin/python
Mon Feb 10 05:19:12 2025: [2025-02-10 05:17:52.184 AM PST][WARNING][submit_slurm_parent]:
Mon Feb 10 05:19:12 2025:
Mon Feb 10 05:19:12 2025:
Mon Feb 10 05:19:12 2025: !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
Mon Feb 10 05:19:12 2025: This host seems to be a data copier or container build node, which usually will not be set up to submit jobs.
Mon Feb 10 05:19:12 2025: Did you possibly mean to do it from a login node instead?
Mon Feb 10 05:19:12 2025: !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
Mon Feb 10 05:19:12 2025:
Mon Feb 10 05:19:12 2025: [2025-02-10 05:17:52.697 AM PST][WARNING][submit_job]: `--autoresume_method` is deprecated and will be removed in a future release, when all follow-ups use the requeue method.
Mon Feb 10 05:19:12 2025: Please reach out to `@adlr-support` in Slack if you rely on it and need to discuss options.
Mon Feb 10 05:19:12 2025:
Mon Feb 10 05:19:12 2025: [2025-02-10 05:17:58.151 AM PST][INFO][submit_slurm_parent]: Forcing exclusive mode
Mon Feb 10 05:19:12 2025: [2025-02-10 05:17:58.151 AM PST][INFO][submit_job]: Creating the logdir: /lustre/fs12/portfolios/llmservice/users/zhidingy/vlmeval/work_dirs/eval/Eagle-Next/MMVet_20250210-051758
Mon Feb 10 05:19:12 2025: [2025-02-10 05:17:59.238 AM PST][INFO][submit_slurm_parent]: srun_commands=srun --kill-on-bad-exit=1 --container-image=/home/zhidingy/workspace/eagle2/torch2_test.sqsh --container-mounts=/lustre/fs11/portfolios/adlr/projects/adlr_other_infra/python:/lustre/fs11/portfolios/adlr/projects/adlr_other_infra/python:ro,/lustre/fs11/portfolios/adlr/projects/adlr_other_infra/release/cluster-interface/13.11_2025-02-05_11-20-02:/lustre/fs11/portfolios/adlr/projects/adlr_other_infra/release/cluster-interface/13.11_2025-02-05_11-20-02:ro,/home/adlr/adlr-utils/release/cluster-interface/latest:/home/adlr/adlr-utils/release/cluster-interface/latest:ro,/dev/fuse:/dev/fuse:rw,/home/zhidingy:/home/zhidingy:rw,/lustre/fs12/portfolios/llmservice/users/zhidingy/vlmeval/work_dirs/eval/Eagle-Next/MMVet_20250210-051758:/lustre/fs12/portfolios/llmservice/users/zhidingy/vlmeval/work_dirs/eval/Eagle-Next/MMVet_20250210-051758:rw,/lustre/fs12/portfolios/llmservice/users/zhidingy/vlmeval:/lustre/fs12/portfolios/llmservice/users/zhidingy/vlmeval:rw,/home/:/home/:rw,/lustre:/lustre:rw /lustre/fs12/portfolios/llmservice/users/zhidingy/vlmeval/work_dirs/eval/Eagle-Next/MMVet_20250210-051758/node_command_MMVet_20250210-051758.sh &
Mon Feb 10 05:19:12 2025: [2025-02-10 05:17:59.243 AM PST][INFO][submit_job]: Details of submit command: sbatch /lustre/fs12/portfolios/llmservice/users/zhidingy/vlmeval/work_dirs/eval/Eagle-Next/MMVet_20250210-051758/sbatch_MMVet_20250210-051758.sh
Mon Feb 10 05:19:12 2025: [2025-02-10 05:17:59.248 AM PST][INFO][utils]: Executing command: /lustre/fs12/portfolios/llmservice/users/zhidingy/vlmeval/work_dirs/eval/Eagle-Next/MMVet_20250210-051758/cluster_submit_command_MMVet_20250210-051758.sh
Mon Feb 10 05:19:12 2025: [2025-02-10 05:17:59.768 AM PST][INFO][utils]: Stdout:
Mon Feb 10 05:19:12 2025: Submitted batch job 2907128
Mon Feb 10 05:19:12 2025:
Mon Feb 10 05:19:12 2025: [2025-02-10 05:17:59.768 AM PST][INFO][slurm]: Job Id is 2907128
Mon Feb 10 05:19:12 2025: [2025-02-10 05:18:00.133 AM PST][INFO][submit_job]: Non blocking execution - job has been submitted.
Mon Feb 10 05:19:12 2025: CLUSTER=NSS CLUSTER_STACK=NSS CLUSTER_NAME=DRACO_OCI_IAD subdir=nss
Mon Feb 10 05:19:12 2025: [2025-02-10 05:18:03.984 AM PST][INFO][load_config]: Loading NSS config: /home/adlr/adlr-utils/release/cluster-interface/latest/nss/config-draco-oci.json
Mon Feb 10 05:19:12 2025: [2025-02-10 05:18:03.985 AM PST][INFO][load_config]: Overriding from env: NSS_ADLR_PYTHON=NSSSUB_ADLR_UTILS_ENV_ROOT/python/latest/bin/python
Mon Feb 10 05:19:12 2025: [2025-02-10 05:18:03.989 AM PST][WARNING][submit_slurm_parent]:
Mon Feb 10 05:19:12 2025:
Mon Feb 10 05:19:12 2025:
Mon Feb 10 05:19:12 2025: !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
Mon Feb 10 05:19:12 2025: This host seems to be a data copier or container build node, which usually will not be set up to submit jobs.
Mon Feb 10 05:19:12 2025: Did you possibly mean to do it from a login node instead?
Mon Feb 10 05:19:12 2025: !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
Mon Feb 10 05:19:12 2025:
Mon Feb 10 05:19:12 2025: [2025-02-10 05:18:04.398 AM PST][WARNING][submit_job]: `--autoresume_method` is deprecated and will be removed in a future release, when all follow-ups use the requeue method.
Mon Feb 10 05:19:12 2025: Please reach out to `@adlr-support` in Slack if you rely on it and need to discuss options.
Mon Feb 10 05:19:12 2025:
Mon Feb 10 05:19:12 2025: [2025-02-10 05:18:07.951 AM PST][INFO][submit_slurm_parent]: Forcing exclusive mode
Mon Feb 10 05:19:12 2025: [2025-02-10 05:18:07.952 AM PST][INFO][submit_job]: Creating the logdir: /lustre/fs12/portfolios/llmservice/users/zhidingy/vlmeval/work_dirs/eval/Eagle-Next/MME_20250210-051807
Mon Feb 10 05:19:12 2025: [2025-02-10 05:18:09.060 AM PST][INFO][submit_slurm_parent]: srun_commands=srun --kill-on-bad-exit=1 --container-image=/home/zhidingy/workspace/eagle2/torch2_test.sqsh --container-mounts=/lustre/fs11/portfolios/adlr/projects/adlr_other_infra/python:/lustre/fs11/portfolios/adlr/projects/adlr_other_infra/python:ro,/lustre/fs11/portfolios/adlr/projects/adlr_other_infra/release/cluster-interface/13.11_2025-02-05_11-20-02:/lustre/fs11/portfolios/adlr/projects/adlr_other_infra/release/cluster-interface/13.11_2025-02-05_11-20-02:ro,/home/adlr/adlr-utils/release/cluster-interface/latest:/home/adlr/adlr-utils/release/cluster-interface/latest:ro,/dev/fuse:/dev/fuse:rw,/home/zhidingy:/home/zhidingy:rw,/lustre/fs12/portfolios/llmservice/users/zhidingy/vlmeval/work_dirs/eval/Eagle-Next/MME_20250210-051807:/lustre/fs12/portfolios/llmservice/users/zhidingy/vlmeval/work_dirs/eval/Eagle-Next/MME_20250210-051807:rw,/lustre/fs12/portfolios/llmservice/users/zhidingy/vlmeval:/lustre/fs12/portfolios/llmservice/users/zhidingy/vlmeval:rw,/home/:/home/:rw,/lustre:/lustre:rw /lustre/fs12/portfolios/llmservice/users/zhidingy/vlmeval/work_dirs/eval/Eagle-Next/MME_20250210-051807/node_command_MME_20250210-051807.sh &
Mon Feb 10 05:19:12 2025: [2025-02-10 05:18:09.065 AM PST][INFO][submit_job]: Details of submit command: sbatch /lustre/fs12/portfolios/llmservice/users/zhidingy/vlmeval/work_dirs/eval/Eagle-Next/MME_20250210-051807/sbatch_MME_20250210-051807.sh
Mon Feb 10 05:19:12 2025: [2025-02-10 05:18:09.070 AM PST][INFO][utils]: Executing command: /lustre/fs12/portfolios/llmservice/users/zhidingy/vlmeval/work_dirs/eval/Eagle-Next/MME_20250210-051807/cluster_submit_command_MME_20250210-051807.sh
Mon Feb 10 05:19:12 2025: [2025-02-10 05:18:10.221 AM PST][INFO][utils]: Stdout:
Mon Feb 10 05:19:12 2025: Submitted batch job 2907134
Mon Feb 10 05:19:12 2025:
Mon Feb 10 05:19:12 2025: [2025-02-10 05:18:10.221 AM PST][INFO][slurm]: Job Id is 2907134
Mon Feb 10 05:19:12 2025: [2025-02-10 05:18:10.608 AM PST][INFO][submit_job]: Non blocking execution - job has been submitted.
Mon Feb 10 05:19:12 2025: CLUSTER=NSS CLUSTER_STACK=NSS CLUSTER_NAME=DRACO_OCI_IAD subdir=nss
Mon Feb 10 05:19:12 2025: [2025-02-10 05:18:16.674 AM PST][INFO][load_config]: Loading NSS config: /home/adlr/adlr-utils/release/cluster-interface/latest/nss/config-draco-oci.json
Mon Feb 10 05:19:12 2025: [2025-02-10 05:18:16.675 AM PST][INFO][load_config]: Overriding from env: NSS_ADLR_PYTHON=NSSSUB_ADLR_UTILS_ENV_ROOT/python/latest/bin/python
Mon Feb 10 05:19:12 2025: [2025-02-10 05:18:16.679 AM PST][WARNING][submit_slurm_parent]:
Mon Feb 10 05:19:12 2025:
Mon Feb 10 05:19:12 2025:
Mon Feb 10 05:19:12 2025: !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
Mon Feb 10 05:19:12 2025: This host seems to be a data copier or container build node, which usually will not be set up to submit jobs.
Mon Feb 10 05:19:12 2025: Did you possibly mean to do it from a login node instead?
Mon Feb 10 05:19:12 2025: !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
Mon Feb 10 05:19:12 2025:
Mon Feb 10 05:19:12 2025: [2025-02-10 05:18:17.143 AM PST][WARNING][submit_job]: `--autoresume_method` is deprecated and will be removed in a future release, when all follow-ups use the requeue method.
Mon Feb 10 05:19:12 2025: Please reach out to `@adlr-support` in Slack if you rely on it and need to discuss options.
Mon Feb 10 05:19:12 2025:
Mon Feb 10 05:19:12 2025: [2025-02-10 05:18:20.911 AM PST][INFO][submit_slurm_parent]: Forcing exclusive mode
Mon Feb 10 05:19:12 2025: [2025-02-10 05:18:20.911 AM PST][INFO][submit_job]: Creating the logdir: /lustre/fs12/portfolios/llmservice/users/zhidingy/vlmeval/work_dirs/eval/Eagle-Next/MMStar_20250210-051820
Mon Feb 10 05:19:12 2025: [2025-02-10 05:18:21.868 AM PST][INFO][submit_slurm_parent]: srun_commands=srun --kill-on-bad-exit=1 --container-image=/home/zhidingy/workspace/eagle2/torch2_test.sqsh --container-mounts=/lustre/fs11/portfolios/adlr/projects/adlr_other_infra/python:/lustre/fs11/portfolios/adlr/projects/adlr_other_infra/python:ro,/lustre/fs11/portfolios/adlr/projects/adlr_other_infra/release/cluster-interface/13.11_2025-02-05_11-20-02:/lustre/fs11/portfolios/adlr/projects/adlr_other_infra/release/cluster-interface/13.11_2025-02-05_11-20-02:ro,/home/adlr/adlr-utils/release/cluster-interface/latest:/home/adlr/adlr-utils/release/cluster-interface/latest:ro,/dev/fuse:/dev/fuse:rw,/home/zhidingy:/home/zhidingy:rw,/lustre/fs12/portfolios/llmservice/users/zhidingy/vlmeval/work_dirs/eval/Eagle-Next/MMStar_20250210-051820:/lustre/fs12/portfolios/llmservice/users/zhidingy/vlmeval/work_dirs/eval/Eagle-Next/MMStar_20250210-051820:rw,/lustre/fs12/portfolios/llmservice/users/zhidingy/vlmeval:/lustre/fs12/portfolios/llmservice/users/zhidingy/vlmeval:rw,/home/:/home/:rw,/lustre:/lustre:rw /lustre/fs12/portfolios/llmservice/users/zhidingy/vlmeval/work_dirs/eval/Eagle-Next/MMStar_20250210-051820/node_command_MMStar_20250210-051820.sh &
Mon Feb 10 05:19:12 2025: [2025-02-10 05:18:21.873 AM PST][INFO][submit_job]: Details of submit command: sbatch /lustre/fs12/portfolios/llmservice/users/zhidingy/vlmeval/work_dirs/eval/Eagle-Next/MMStar_20250210-051820/sbatch_MMStar_20250210-051820.sh
Mon Feb 10 05:19:12 2025: [2025-02-10 05:18:21.877 AM PST][INFO][utils]: Executing command: /lustre/fs12/portfolios/llmservice/users/zhidingy/vlmeval/work_dirs/eval/Eagle-Next/MMStar_20250210-051820/cluster_submit_command_MMStar_20250210-051820.sh
Mon Feb 10 05:19:12 2025: [2025-02-10 05:18:22.576 AM PST][INFO][utils]: Stdout:
Mon Feb 10 05:19:12 2025: Submitted batch job 2907147
Mon Feb 10 05:19:12 2025:
Mon Feb 10 05:19:12 2025: [2025-02-10 05:18:22.577 AM PST][INFO][slurm]: Job Id is 2907147
Mon Feb 10 05:19:12 2025: [2025-02-10 05:18:22.940 AM PST][INFO][submit_job]: Non blocking execution - job has been submitted.
Mon Feb 10 05:19:12 2025: CLUSTER=NSS CLUSTER_STACK=NSS CLUSTER_NAME=DRACO_OCI_IAD subdir=nss
Mon Feb 10 05:19:12 2025: [2025-02-10 05:18:26.281 AM PST][INFO][load_config]: Loading NSS config: /home/adlr/adlr-utils/release/cluster-interface/latest/nss/config-draco-oci.json
Mon Feb 10 05:19:12 2025: [2025-02-10 05:18:26.282 AM PST][INFO][load_config]: Overriding from env: NSS_ADLR_PYTHON=NSSSUB_ADLR_UTILS_ENV_ROOT/python/latest/bin/python
Mon Feb 10 05:19:12 2025: [2025-02-10 05:18:26.285 AM PST][WARNING][submit_slurm_parent]:
Mon Feb 10 05:19:12 2025:
Mon Feb 10 05:19:12 2025:
Mon Feb 10 05:19:12 2025: !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
Mon Feb 10 05:19:12 2025: This host seems to be a data copier or container build node, which usually will not be set up to submit jobs.
Mon Feb 10 05:19:12 2025: Did you possibly mean to do it from a login node instead?
Mon Feb 10 05:19:12 2025: !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
Mon Feb 10 05:19:12 2025:
Mon Feb 10 05:19:12 2025: [2025-02-10 05:18:26.671 AM PST][WARNING][submit_job]: `--autoresume_method` is deprecated and will be removed in a future release, when all follow-ups use the requeue method.
Mon Feb 10 05:19:12 2025: Please reach out to `@adlr-support` in Slack if you rely on it and need to discuss options.
Mon Feb 10 05:19:12 2025:
Mon Feb 10 05:19:12 2025: [2025-02-10 05:18:29.759 AM PST][INFO][submit_slurm_parent]: Forcing exclusive mode
Mon Feb 10 05:19:12 2025: [2025-02-10 05:18:29.759 AM PST][INFO][submit_job]: Creating the logdir: /lustre/fs12/portfolios/llmservice/users/zhidingy/vlmeval/work_dirs/eval/Eagle-Next/CCBench_20250210-051829
Mon Feb 10 05:19:12 2025: [2025-02-10 05:18:30.704 AM PST][INFO][submit_slurm_parent]: srun_commands=srun --kill-on-bad-exit=1 --container-image=/home/zhidingy/workspace/eagle2/torch2_test.sqsh --container-mounts=/lustre/fs11/portfolios/adlr/projects/adlr_other_infra/python:/lustre/fs11/portfolios/adlr/projects/adlr_other_infra/python:ro,/lustre/fs11/portfolios/adlr/projects/adlr_other_infra/release/cluster-interface/13.11_2025-02-05_11-20-02:/lustre/fs11/portfolios/adlr/projects/adlr_other_infra/release/cluster-interface/13.11_2025-02-05_11-20-02:ro,/home/adlr/adlr-utils/release/cluster-interface/latest:/home/adlr/adlr-utils/release/cluster-interface/latest:ro,/dev/fuse:/dev/fuse:rw,/home/zhidingy:/home/zhidingy:rw,/lustre/fs12/portfolios/llmservice/users/zhidingy/vlmeval/work_dirs/eval/Eagle-Next/CCBench_20250210-051829:/lustre/fs12/portfolios/llmservice/users/zhidingy/vlmeval/work_dirs/eval/Eagle-Next/CCBench_20250210-051829:rw,/lustre/fs12/portfolios/llmservice/users/zhidingy/vlmeval:/lustre/fs12/portfolios/llmservice/users/zhidingy/vlmeval:rw,/home/:/home/:rw,/lustre:/lustre:rw /lustre/fs12/portfolios/llmservice/users/zhidingy/vlmeval/work_dirs/eval/Eagle-Next/CCBench_20250210-051829/node_command_CCBench_20250210-051829.sh &
Mon Feb 10 05:19:12 2025: [2025-02-10 05:18:30.709 AM PST][INFO][submit_job]: Details of submit command: sbatch /lustre/fs12/portfolios/llmservice/users/zhidingy/vlmeval/work_dirs/eval/Eagle-Next/CCBench_20250210-051829/sbatch_CCBench_20250210-051829.sh
Mon Feb 10 05:19:12 2025: [2025-02-10 05:18:30.714 AM PST][INFO][utils]: Executing command: /lustre/fs12/portfolios/llmservice/users/zhidingy/vlmeval/work_dirs/eval/Eagle-Next/CCBench_20250210-051829/cluster_submit_command_CCBench_20250210-051829.sh
Mon Feb 10 05:19:12 2025: [2025-02-10 05:18:31.371 AM PST][INFO][utils]: Stdout:
Mon Feb 10 05:19:12 2025: Submitted batch job 2907155
Mon Feb 10 05:19:12 2025:
Mon Feb 10 05:19:12 2025: [2025-02-10 05:18:31.371 AM PST][INFO][slurm]: Job Id is 2907155
Mon Feb 10 05:19:12 2025: [2025-02-10 05:18:31.739 AM PST][INFO][submit_job]: Non blocking execution - job has been submitted.
Mon Feb 10 05:19:12 2025: CLUSTER=NSS CLUSTER_STACK=NSS CLUSTER_NAME=DRACO_OCI_IAD subdir=nss
Mon Feb 10 05:19:12 2025: [2025-02-10 05:18:35.057 AM PST][INFO][load_config]: Loading NSS config: /home/adlr/adlr-utils/release/cluster-interface/latest/nss/config-draco-oci.json
Mon Feb 10 05:19:12 2025: [2025-02-10 05:18:35.058 AM PST][INFO][load_config]: Overriding from env: NSS_ADLR_PYTHON=NSSSUB_ADLR_UTILS_ENV_ROOT/python/latest/bin/python
Mon Feb 10 05:19:12 2025: [2025-02-10 05:18:35.062 AM PST][WARNING][submit_slurm_parent]:
Mon Feb 10 05:19:12 2025:
Mon Feb 10 05:19:12 2025:
Mon Feb 10 05:19:12 2025: !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
Mon Feb 10 05:19:12 2025: This host seems to be a data copier or container build node, which usually will not be set up to submit jobs.
Mon Feb 10 05:19:12 2025: Did you possibly mean to do it from a login node instead?
Mon Feb 10 05:19:12 2025: !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
Mon Feb 10 05:19:12 2025:
Mon Feb 10 05:19:12 2025: [2025-02-10 05:18:35.518 AM PST][WARNING][submit_job]: `--autoresume_method` is deprecated and will be removed in a future release, when all follow-ups use the requeue method.
Mon Feb 10 05:19:12 2025: Please reach out to `@adlr-support` in Slack if you rely on it and need to discuss options.
Mon Feb 10 05:19:12 2025:
Mon Feb 10 05:19:12 2025: [2025-02-10 05:18:38.771 AM PST][INFO][submit_slurm_parent]: Forcing exclusive mode
Mon Feb 10 05:19:12 2025: [2025-02-10 05:18:38.771 AM PST][INFO][submit_job]: Creating the logdir: /lustre/fs12/portfolios/llmservice/users/zhidingy/vlmeval/work_dirs/eval/Eagle-Next/MMBench_TEST_CN_V11_20250210-051838
Mon Feb 10 05:19:12 2025: [2025-02-10 05:18:39.709 AM PST][INFO][submit_slurm_parent]: srun_commands=srun --kill-on-bad-exit=1 --container-image=/home/zhidingy/workspace/eagle2/torch2_test.sqsh --container-mounts=/lustre/fs11/portfolios/adlr/projects/adlr_other_infra/python:/lustre/fs11/portfolios/adlr/projects/adlr_other_infra/python:ro,/lustre/fs11/portfolios/adlr/projects/adlr_other_infra/release/cluster-interface/13.11_2025-02-05_11-20-02:/lustre/fs11/portfolios/adlr/projects/adlr_other_infra/release/cluster-interface/13.11_2025-02-05_11-20-02:ro,/home/adlr/adlr-utils/release/cluster-interface/latest:/home/adlr/adlr-utils/release/cluster-interface/latest:ro,/dev/fuse:/dev/fuse:rw,/home/zhidingy:/home/zhidingy:rw,/lustre/fs12/portfolios/llmservice/users/zhidingy/vlmeval/work_dirs/eval/Eagle-Next/MMBench_TEST_CN_V11_20250210-051838:/lustre/fs12/portfolios/llmservice/users/zhidingy/vlmeval/work_dirs/eval/Eagle-Next/MMBench_TEST_CN_V11_20250210-051838:rw,/lustre/fs12/portfolios/llmservice/users/zhidingy/vlmeval:/lustre/fs12/portfolios/llmservice/users/zhidingy/vlmeval:rw,/home/:/home/:rw,/lustre:/lustre:rw /lustre/fs12/portfolios/llmservice/users/zhidingy/vlmeval/work_dirs/eval/Eagle-Next/MMBench_TEST_CN_V11_20250210-051838/node_command_MMBench_TEST_CN_V11_20250210-051838.sh &
Mon Feb 10 05:19:12 2025: [2025-02-10 05:18:39.714 AM PST][INFO][submit_job]: Details of submit command: sbatch /lustre/fs12/portfolios/llmservice/users/zhidingy/vlmeval/work_dirs/eval/Eagle-Next/MMBench_TEST_CN_V11_20250210-051838/sbatch_MMBench_TEST_CN_V11_20250210-051838.sh
Mon Feb 10 05:19:12 2025: [2025-02-10 05:18:39.718 AM PST][INFO][utils]: Executing command: /lustre/fs12/portfolios/llmservice/users/zhidingy/vlmeval/work_dirs/eval/Eagle-Next/MMBench_TEST_CN_V11_20250210-051838/cluster_submit_command_MMBench_TEST_CN_V11_20250210-051838.sh
Mon Feb 10 05:19:12 2025: [2025-02-10 05:18:40.399 AM PST][INFO][utils]: Stdout:
Mon Feb 10 05:19:12 2025: Submitted batch job 2907160
Mon Feb 10 05:19:12 2025:
Mon Feb 10 05:19:12 2025: [2025-02-10 05:18:40.399 AM PST][INFO][slurm]: Job Id is 2907160
Mon Feb 10 05:19:12 2025: [2025-02-10 05:18:40.760 AM PST][INFO][submit_job]: Non blocking execution - job has been submitted.
Mon Feb 10 05:19:12 2025: CLUSTER=NSS CLUSTER_STACK=NSS CLUSTER_NAME=DRACO_OCI_IAD subdir=nss
Mon Feb 10 05:19:12 2025: [2025-02-10 05:18:44.042 AM PST][INFO][load_config]: Loading NSS config: /home/adlr/adlr-utils/release/cluster-interface/latest/nss/config-draco-oci.json
Mon Feb 10 05:19:12 2025: [2025-02-10 05:18:44.044 AM PST][INFO][load_config]: Overriding from env: NSS_ADLR_PYTHON=NSSSUB_ADLR_UTILS_ENV_ROOT/python/latest/bin/python
Mon Feb 10 05:19:12 2025: [2025-02-10 05:18:44.047 AM PST][WARNING][submit_slurm_parent]:
Mon Feb 10 05:19:12 2025:
Mon Feb 10 05:19:12 2025:
Mon Feb 10 05:19:12 2025: !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
Mon Feb 10 05:19:12 2025: This host seems to be a data copier or container build node, which usually will not be set up to submit jobs.
Mon Feb 10 05:19:12 2025: Did you possibly mean to do it from a login node instead?
Mon Feb 10 05:19:12 2025: !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
Mon Feb 10 05:19:12 2025:
Mon Feb 10 05:19:12 2025: [2025-02-10 05:18:44.443 AM PST][WARNING][submit_job]: `--autoresume_method` is deprecated and will be removed in a future release, when all follow-ups use the requeue method.
Mon Feb 10 05:19:12 2025: Please reach out to `@adlr-support` in Slack if you rely on it and need to discuss options.
Mon Feb 10 05:19:12 2025:
Mon Feb 10 05:19:12 2025: [2025-02-10 05:18:47.391 AM PST][INFO][submit_slurm_parent]: Forcing exclusive mode
Mon Feb 10 05:19:12 2025: [2025-02-10 05:18:47.391 AM PST][INFO][submit_job]: Creating the logdir: /lustre/fs12/portfolios/llmservice/users/zhidingy/vlmeval/work_dirs/eval/Eagle-Next/MMBench_TEST_EN_V11_20250210-051847
Mon Feb 10 05:19:12 2025: [2025-02-10 05:18:48.560 AM PST][INFO][submit_slurm_parent]: srun_commands=srun --kill-on-bad-exit=1 --container-image=/home/zhidingy/workspace/eagle2/torch2_test.sqsh --container-mounts=/lustre/fs11/portfolios/adlr/projects/adlr_other_infra/python:/lustre/fs11/portfolios/adlr/projects/adlr_other_infra/python:ro,/lustre/fs11/portfolios/adlr/projects/adlr_other_infra/release/cluster-interface/13.11_2025-02-05_11-20-02:/lustre/fs11/portfolios/adlr/projects/adlr_other_infra/release/cluster-interface/13.11_2025-02-05_11-20-02:ro,/home/adlr/adlr-utils/release/cluster-interface/latest:/home/adlr/adlr-utils/release/cluster-interface/latest:ro,/dev/fuse:/dev/fuse:rw,/home/zhidingy:/home/zhidingy:rw,/lustre/fs12/portfolios/llmservice/users/zhidingy/vlmeval/work_dirs/eval/Eagle-Next/MMBench_TEST_EN_V11_20250210-051847:/lustre/fs12/portfolios/llmservice/users/zhidingy/vlmeval/work_dirs/eval/Eagle-Next/MMBench_TEST_EN_V11_20250210-051847:rw,/lustre/fs12/portfolios/llmservice/users/zhidingy/vlmeval:/lustre/fs12/portfolios/llmservice/users/zhidingy/vlmeval:rw,/home/:/home/:rw,/lustre:/lustre:rw /lustre/fs12/portfolios/llmservice/users/zhidingy/vlmeval/work_dirs/eval/Eagle-Next/MMBench_TEST_EN_V11_20250210-051847/node_command_MMBench_TEST_EN_V11_20250210-051847.sh &
Mon Feb 10 05:19:12 2025: [2025-02-10 05:18:48.565 AM PST][INFO][submit_job]: Details of submit command: sbatch /lustre/fs12/portfolios/llmservice/users/zhidingy/vlmeval/work_dirs/eval/Eagle-Next/MMBench_TEST_EN_V11_20250210-051847/sbatch_MMBench_TEST_EN_V11_20250210-051847.sh
Mon Feb 10 05:19:12 2025: [2025-02-10 05:18:48.569 AM PST][INFO][utils]: Executing command: /lustre/fs12/portfolios/llmservice/users/zhidingy/vlmeval/work_dirs/eval/Eagle-Next/MMBench_TEST_EN_V11_20250210-051847/cluster_submit_command_MMBench_TEST_EN_V11_20250210-051847.sh
Mon Feb 10 05:19:12 2025: [2025-02-10 05:18:50.734 AM PST][INFO][utils]: Stdout:
Mon Feb 10 05:19:12 2025: Submitted batch job 2907164
Mon Feb 10 05:19:12 2025:
Mon Feb 10 05:19:12 2025: [2025-02-10 05:18:50.734 AM PST][INFO][slurm]: Job Id is 2907164
Mon Feb 10 05:19:12 2025: [2025-02-10 05:18:51.101 AM PST][INFO][submit_job]: Non blocking execution - job has been submitted.
Mon Feb 10 05:19:12 2025: CLUSTER=NSS CLUSTER_STACK=NSS CLUSTER_NAME=DRACO_OCI_IAD subdir=nss
Mon Feb 10 05:19:12 2025: [2025-02-10 05:18:54.547 AM PST][INFO][load_config]: Loading NSS config: /home/adlr/adlr-utils/release/cluster-interface/latest/nss/config-draco-oci.json
Mon Feb 10 05:19:12 2025: [2025-02-10 05:18:54.548 AM PST][INFO][load_config]: Overriding from env: NSS_ADLR_PYTHON=NSSSUB_ADLR_UTILS_ENV_ROOT/python/latest/bin/python
Mon Feb 10 05:19:12 2025: [2025-02-10 05:18:58.499 AM PST][INFO][submit_slurm_parent]: Forcing exclusive mode
Mon Feb 10 05:19:12 2025: [2025-02-10 05:18:58.499 AM PST][INFO][submit_job]: Creating the logdir: /lustre/fs12/portfolios/llmservice/users/zhidingy/vlmeval/work_dirs/eval/Eagle-Next/DocVQA_TEST_20250210-051858
Mon Feb 10 05:19:12 2025: [2025-02-10 05:18:59.443 AM PST][INFO][submit_slurm_parent]: srun_commands=srun --kill-on-bad-exit=1 --container-image=/home/zhidingy/workspace/eagle2/torch2_test.sqsh --container-mounts=/lustre/fs11/portfolios/adlr/projects/adlr_other_infra/python:/lustre/fs11/portfolios/adlr/projects/adlr_other_infra/python:ro,/lustre/fs11/portfolios/adlr/projects/adlr_other_infra/release/cluster-interface/13.11_2025-02-05_11-20-02:/lustre/fs11/portfolios/adlr/projects/adlr_other_infra/release/cluster-interface/13.11_2025-02-05_11-20-02:ro,/home/adlr/adlr-utils/release/cluster-interface/latest:/home/adlr/adlr-utils/release/cluster-interface/latest:ro,/dev/fuse:/dev/fuse:rw,/home/zhidingy:/home/zhidingy:rw,/lustre/fs12/portfolios/llmservice/users/zhidingy/vlmeval/work_dirs/eval/Eagle-Next/DocVQA_TEST_20250210-051858:/lustre/fs12/portfolios/llmservice/users/zhidingy/vlmeval/work_dirs/eval/Eagle-Next/DocVQA_TEST_20250210-051858:rw,/lustre/fs12/portfolios/llmservice/users/zhidingy/vlmeval:/lustre/fs12/portfolios/llmservice/users/zhidingy/vlmeval:rw,/home/:/home/:rw,/lustre:/lustre:rw /lustre/fs12/portfolios/llmservice/users/zhidingy/vlmeval/work_dirs/eval/Eagle-Next/DocVQA_TEST_20250210-051858/node_command_DocVQA_TEST_20250210-051858.sh &
Mon Feb 10 05:19:12 2025: [2025-02-10 05:18:59.448 AM PST][INFO][submit_job]: Details of submit command: sbatch /lustre/fs12/portfolios/llmservice/users/zhidingy/vlmeval/work_dirs/eval/Eagle-Next/DocVQA_TEST_20250210-051858/sbatch_DocVQA_TEST_20250210-051858.sh
Mon Feb 10 05:19:12 2025: [2025-02-10 05:18:59.452 AM PST][INFO][utils]: Executing command: /lustre/fs12/portfolios/llmservice/users/zhidingy/vlmeval/work_dirs/eval/Eagle-Next/DocVQA_TEST_20250210-051858/cluster_submit_command_DocVQA_TEST_20250210-051858.sh
Mon Feb 10 05:19:12 2025: [2025-02-10 05:18:59.973 AM PST][INFO][utils]: Stdout:
Mon Feb 10 05:19:12 2025: Submitted batch job 2907173
Mon Feb 10 05:19:12 2025:
Mon Feb 10 05:19:12 2025: [2025-02-10 05:18:59.973 AM PST][INFO][slurm]: Job Id is 2907173
Mon Feb 10 05:19:12 2025: [2025-02-10 05:19:00.339 AM PST][INFO][submit_job]: Non blocking execution - job has been submitted.
Mon Feb 10 05:19:12 2025: CLUSTER=NSS CLUSTER_STACK=NSS CLUSTER_NAME=DRACO_OCI_IAD subdir=nss
Mon Feb 10 05:19:12 2025: [2025-02-10 05:19:03.744 AM PST][INFO][load_config]: Loading NSS config: /home/adlr/adlr-utils/release/cluster-interface/latest/nss/config-draco-oci.json
Mon Feb 10 05:19:12 2025: [2025-02-10 05:19:03.746 AM PST][INFO][load_config]: Overriding from env: NSS_ADLR_PYTHON=NSSSUB_ADLR_UTILS_ENV_ROOT/python/latest/bin/python
Mon Feb 10 05:19:12 2025: [2025-02-10 05:19:07.375 AM PST][INFO][submit_slurm_parent]: Forcing exclusive mode
Mon Feb 10 05:19:12 2025: [2025-02-10 05:19:07.375 AM PST][INFO][submit_job]: Creating the logdir: /lustre/fs12/portfolios/llmservice/users/zhidingy/vlmeval/work_dirs/eval/Eagle-Next/InfoVQA_TEST_20250210-051907
Mon Feb 10 05:19:12 2025: [2025-02-10 05:19:08.312 AM PST][INFO][submit_slurm_parent]: srun_commands=srun --kill-on-bad-exit=1 --container-image=/home/zhidingy/workspace/eagle2/torch2_test.sqsh --container-mounts=/lustre/fs11/portfolios/adlr/projects/adlr_other_infra/python:/lustre/fs11/portfolios/adlr/projects/adlr_other_infra/python:ro,/lustre/fs11/portfolios/adlr/projects/adlr_other_infra/release/cluster-interface/13.11_2025-02-05_11-20-02:/lustre/fs11/portfolios/adlr/projects/adlr_other_infra/release/cluster-interface/13.11_2025-02-05_11-20-02:ro,/home/adlr/adlr-utils/release/cluster-interface/latest:/home/adlr/adlr-utils/release/cluster-interface/latest:ro,/dev/fuse:/dev/fuse:rw,/home/zhidingy:/home/zhidingy:rw,/lustre/fs12/portfolios/llmservice/users/zhidingy/vlmeval/work_dirs/eval/Eagle-Next/InfoVQA_TEST_20250210-051907:/lustre/fs12/portfolios/llmservice/users/zhidingy/vlmeval/work_dirs/eval/Eagle-Next/InfoVQA_TEST_20250210-051907:rw,/lustre/fs12/portfolios/llmservice/users/zhidingy/vlmeval:/lustre/fs12/portfolios/llmservice/users/zhidingy/vlmeval:rw,/home/:/home/:rw,/lustre:/lustre:rw /lustre/fs12/portfolios/llmservice/users/zhidingy/vlmeval/work_dirs/eval/Eagle-Next/InfoVQA_TEST_20250210-051907/node_command_InfoVQA_TEST_20250210-051907.sh &
Mon Feb 10 05:19:12 2025: [2025-02-10 05:19:08.317 AM PST][INFO][submit_job]: Details of submit command: sbatch /lustre/fs12/portfolios/llmservice/users/zhidingy/vlmeval/work_dirs/eval/Eagle-Next/InfoVQA_TEST_20250210-051907/sbatch_InfoVQA_TEST_20250210-051907.sh
Mon Feb 10 05:19:12 2025: [2025-02-10 05:19:08.321 AM PST][INFO][utils]: Executing command: /lustre/fs12/portfolios/llmservice/users/zhidingy/vlmeval/work_dirs/eval/Eagle-Next/InfoVQA_TEST_20250210-051907/cluster_submit_command_InfoVQA_TEST_20250210-051907.sh
Mon Feb 10 05:19:12 2025: [2025-02-10 05:19:08.857 AM PST][INFO][utils]: Stdout:
Mon Feb 10 05:19:12 2025: Submitted batch job 2907181
Mon Feb 10 05:19:12 2025:
Mon Feb 10 05:19:12 2025: [2025-02-10 05:19:08.858 AM PST][INFO][slurm]: Job Id is 2907181
Mon Feb 10 05:19:12 2025: [2025-02-10 05:19:09.267 AM PST][INFO][submit_job]: Non blocking execution - job has been submitted.