Lab Server Overview

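The cluster consists of one login node (head1) and 15 compute nodes (node1 to node15). The 15 compute nodes belong to two partitions, normal and fat, whose member nodes differ in hardware configuration. Compared with nodes in the normal partition, nodes in the fat partition have more memory (RealMemory of about 515600 MB versus about 128500 MB) and more CPU cores (64 or 72 CPUs versus 28), so they can run more performance-demanding jobs. (GPU nodes are on the way.) The nodes and the partitions they belong to are listed below: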

PARTITION AVAIL  TIMELIMIT  NODES  STATE NODELIST
normal*      up   infinite     10   idle node[1-10]
fat          up   infinite      3    mix node[11-13]
fat          up   infinite      2  alloc node[14-15]
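
The compact node ranges in the list above (for example node[1-10]) can be expanded into individual node names. The following is a minimal Python sketch, assuming the default sinfo column layout shown above; the helper names expand_nodelist and nodes_by_partition are hypothetical, and only a single prefix with one bracket group is handled (full Slurm hostlist syntax, including zero-padded ranges, is richer than this).

```python
import re
from collections import defaultdict

# Sample text copied from the partition list above.
SINFO_TEXT = """\
PARTITION AVAIL  TIMELIMIT  NODES  STATE NODELIST
normal*      up   infinite     10   idle node[1-10]
fat          up   infinite      3    mix node[11-13]
fat          up   infinite      2  alloc node[14-15]
"""


def expand_nodelist(nodelist):
    """Expand a simple hostlist such as 'node[1-3,7]' into individual names.

    Assumption: one prefix followed by one bracket group, no zero padding.
    Anything else is returned unchanged as a single name.
    """
    match = re.fullmatch(r"([^\[\]]+)\[([^\[\]]+)\]", nodelist)
    if match is None:
        return [nodelist]  # a plain name such as 'node5'
    prefix, body = match.groups()
    names = []
    for part in body.split(","):
        if "-" in part:
            start, end = part.split("-")
            names.extend(f"{prefix}{i}" for i in range(int(start), int(end) + 1))
        else:
            names.append(f"{prefix}{part}")
    return names


def nodes_by_partition(sinfo_text):
    """Group node names by partition; the '*' marking the default partition is dropped."""
    partitions = defaultdict(list)
    for line in sinfo_text.strip().splitlines()[1:]:  # skip the header row
        partition, _avail, _limit, _count, _state, nodelist = line.split()
        partitions[partition.rstrip("*")].extend(expand_nodelist(nodelist))
    return dict(partitions)


if __name__ == "__main__":
    for name, nodes in nodes_by_partition(SINFO_TEXT).items():
        print(name, len(nodes), nodes)
```

Rows of the same partition that differ only in state (such as the two fat rows) are merged, so each partition maps to its full node list.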

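The detailed configuration of each node is shown below (in the format printed by scontrol show node). Some node states here do not match the partition list above (for example, node1 is MIXED and node13 is DOWN), presumably because the two outputs were captured at different times.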


NodeName=node1 Arch=x86_64 CoresPerSocket=14 
   CPUAlloc=1 CPUEfctv=28 CPUTot=28 CPULoad=1.06
   AvailableFeatures=(null)
   ActiveFeatures=(null)
   Gres=(null)
   NodeAddr=node1 NodeHostName=node1 Version=22.05.3
   OS=Linux 4.18.0-372.9.1.el8.x86_64 #1 SMP Tue May 10 14:48:47 UTC 2022 
   RealMemory=128544 AllocMem=4200 FreeMem=27840 Sockets=2 Boards=1
   State=MIXED ThreadsPerCore=1 TmpDisk=0 Weight=1 Owner=N/A MCS_label=N/A
   Partitions=normal 
   BootTime=2022-09-24T07:05:29 SlurmdStartTime=2022-09-24T07:06:14
   LastBusyTime=2022-11-14T15:46:22
   CfgTRES=cpu=28,mem=128544M,billing=28
   AllocTRES=cpu=1,mem=4200M
   CapWatts=n/a
   CurrentWatts=0 AveWatts=0
   ExtSensorsJoules=n/s ExtSensorsWatts=0 ExtSensorsTemp=n/s

NodeName=node2 Arch=x86_64 CoresPerSocket=14 
   CPUAlloc=0 CPUEfctv=28 CPUTot=28 CPULoad=3.00
   AvailableFeatures=(null)
   ActiveFeatures=(null)
   Gres=(null)
   NodeAddr=node2 NodeHostName=node2 Version=22.05.3
   OS=Linux 4.18.0-372.9.1.el8.x86_64 #1 SMP Tue May 10 14:48:47 UTC 2022 
   RealMemory=128544 AllocMem=0 FreeMem=33277 Sockets=2 Boards=1
   State=IDLE ThreadsPerCore=1 TmpDisk=0 Weight=1 Owner=N/A MCS_label=N/A
   Partitions=normal 
   BootTime=2022-09-29T08:57:38 SlurmdStartTime=2022-09-29T08:58:29
   LastBusyTime=2022-11-14T16:03:51
   CfgTRES=cpu=28,mem=128544M,billing=28
   AllocTRES=
   CapWatts=n/a
   CurrentWatts=0 AveWatts=0
   ExtSensorsJoules=n/s ExtSensorsWatts=0 ExtSensorsTemp=n/s

NodeName=node3 Arch=x86_64 CoresPerSocket=14 
   CPUAlloc=0 CPUEfctv=28 CPUTot=28 CPULoad=0.00
   AvailableFeatures=(null)
   ActiveFeatures=(null)
   Gres=(null)
   NodeAddr=node3 NodeHostName=node3 Version=22.05.3
   OS=Linux 4.18.0-372.9.1.el8.x86_64 #1 SMP Tue May 10 14:48:47 UTC 2022 
   RealMemory=128544 AllocMem=0 FreeMem=84272 Sockets=2 Boards=1
   State=IDLE ThreadsPerCore=1 TmpDisk=0 Weight=1 Owner=N/A MCS_label=N/A
   Partitions=normal 
   BootTime=2022-09-29T08:59:31 SlurmdStartTime=2022-09-29T08:58:29
   LastBusyTime=2022-11-14T17:08:49
   CfgTRES=cpu=28,mem=128544M,billing=28
   AllocTRES=
   CapWatts=n/a
   CurrentWatts=0 AveWatts=0
   ExtSensorsJoules=n/s ExtSensorsWatts=0 ExtSensorsTemp=n/s

NodeName=node4 Arch=x86_64 CoresPerSocket=14 
   CPUAlloc=0 CPUEfctv=28 CPUTot=28 CPULoad=1.00
   AvailableFeatures=(null)
   ActiveFeatures=(null)
   Gres=(null)
   NodeAddr=node4 NodeHostName=node4 Version=22.05.3
   OS=Linux 4.18.0-372.9.1.el8.x86_64 #1 SMP Tue May 10 14:48:47 UTC 2022 
   RealMemory=128543 AllocMem=0 FreeMem=77740 Sockets=2 Boards=1
   State=IDLE ThreadsPerCore=1 TmpDisk=0 Weight=1 Owner=N/A MCS_label=N/A
   Partitions=normal 
   BootTime=2022-09-22T18:44:43 SlurmdStartTime=2022-09-24T07:04:46
   LastBusyTime=2022-11-11T15:25:46
   CfgTRES=cpu=28,mem=128543M,billing=28
   AllocTRES=
   CapWatts=n/a
   CurrentWatts=0 AveWatts=0
   ExtSensorsJoules=n/s ExtSensorsWatts=0 ExtSensorsTemp=n/s

NodeName=node5 Arch=x86_64 CoresPerSocket=14 
   CPUAlloc=0 CPUEfctv=28 CPUTot=28 CPULoad=1.00
   AvailableFeatures=(null)
   ActiveFeatures=(null)
   Gres=(null)
   NodeAddr=node5 NodeHostName=node5 Version=22.05.3
   OS=Linux 4.18.0-372.9.1.el8.x86_64 #1 SMP Tue May 10 14:48:47 UTC 2022 
   RealMemory=128544 AllocMem=0 FreeMem=126777 Sockets=2 Boards=1
   State=IDLE ThreadsPerCore=1 TmpDisk=0 Weight=1 Owner=N/A MCS_label=N/A
   Partitions=normal 
   BootTime=2022-09-22T16:51:12 SlurmdStartTime=2022-09-24T07:04:46
   LastBusyTime=2022-11-11T15:25:46
   CfgTRES=cpu=28,mem=128544M,billing=28
   AllocTRES=
   CapWatts=n/a
   CurrentWatts=0 AveWatts=0
   ExtSensorsJoules=n/s ExtSensorsWatts=0 ExtSensorsTemp=n/s

NodeName=node6 Arch=x86_64 CoresPerSocket=14 
   CPUAlloc=0 CPUEfctv=28 CPUTot=28 CPULoad=0.01
   AvailableFeatures=(null)
   ActiveFeatures=(null)
   Gres=(null)
   NodeAddr=node6 NodeHostName=node6 Version=22.05.3
   OS=Linux 4.18.0-372.9.1.el8.x86_64 #1 SMP Tue May 10 14:48:47 UTC 2022 
   RealMemory=128544 AllocMem=0 FreeMem=120450 Sockets=2 Boards=1
   State=IDLE ThreadsPerCore=1 TmpDisk=0 Weight=1 Owner=N/A MCS_label=N/A
   Partitions=normal 
   BootTime=2022-09-22T19:57:42 SlurmdStartTime=2022-09-24T07:04:46
   LastBusyTime=2022-11-11T15:25:41
   CfgTRES=cpu=28,mem=128544M,billing=28
   AllocTRES=
   CapWatts=n/a
   CurrentWatts=0 AveWatts=0
   ExtSensorsJoules=n/s ExtSensorsWatts=0 ExtSensorsTemp=n/s

NodeName=node7 Arch=x86_64 CoresPerSocket=14 
   CPUAlloc=0 CPUEfctv=28 CPUTot=28 CPULoad=1.00
   AvailableFeatures=(null)
   ActiveFeatures=(null)
   Gres=(null)
   NodeAddr=node7 NodeHostName=node7 Version=22.05.3
   OS=Linux 4.18.0-372.9.1.el8.x86_64 #1 SMP Tue May 10 14:48:47 UTC 2022 
   RealMemory=128544 AllocMem=0 FreeMem=123333 Sockets=2 Boards=1
   State=IDLE ThreadsPerCore=1 TmpDisk=0 Weight=1 Owner=N/A MCS_label=N/A
   Partitions=normal 
   BootTime=2022-09-22T17:41:52 SlurmdStartTime=2022-09-24T07:04:46
   LastBusyTime=2022-11-11T15:25:41
   CfgTRES=cpu=28,mem=128544M,billing=28
   AllocTRES=
   CapWatts=n/a
   CurrentWatts=0 AveWatts=0
   ExtSensorsJoules=n/s ExtSensorsWatts=0 ExtSensorsTemp=n/s

NodeName=node8 Arch=x86_64 CoresPerSocket=14 
   CPUAlloc=0 CPUEfctv=28 CPUTot=28 CPULoad=1.01
   AvailableFeatures=(null)
   ActiveFeatures=(null)
   Gres=(null)
   NodeAddr=node8 NodeHostName=node8 Version=22.05.3
   OS=Linux 4.18.0-372.9.1.el8.x86_64 #1 SMP Tue May 10 14:48:47 UTC 2022 
   RealMemory=128544 AllocMem=0 FreeMem=99475 Sockets=2 Boards=1
   State=IDLE ThreadsPerCore=1 TmpDisk=0 Weight=1 Owner=N/A MCS_label=N/A
   Partitions=normal 
   BootTime=2022-09-22T17:59:52 SlurmdStartTime=2022-09-24T07:04:46
   LastBusyTime=2022-11-11T15:25:37
   CfgTRES=cpu=28,mem=128544M,billing=28
   AllocTRES=
   CapWatts=n/a
   CurrentWatts=0 AveWatts=0
   ExtSensorsJoules=n/s ExtSensorsWatts=0 ExtSensorsTemp=n/s

NodeName=node9 Arch=x86_64 CoresPerSocket=14 
   CPUAlloc=0 CPUEfctv=28 CPUTot=28 CPULoad=1.00
   AvailableFeatures=(null)
   ActiveFeatures=(null)
   Gres=(null)
   NodeAddr=node9 NodeHostName=node9 Version=22.05.3
   OS=Linux 4.18.0-372.9.1.el8.x86_64 #1 SMP Tue May 10 14:48:47 UTC 2022 
   RealMemory=128544 AllocMem=0 FreeMem=1555 Sockets=2 Boards=1
   State=IDLE ThreadsPerCore=1 TmpDisk=0 Weight=1 Owner=N/A MCS_label=N/A
   Partitions=normal 
   BootTime=2022-09-22T18:24:08 SlurmdStartTime=2022-09-24T07:04:46
   LastBusyTime=2022-11-11T15:25:43
   CfgTRES=cpu=28,mem=128544M,billing=28
   AllocTRES=
   CapWatts=n/a
   CurrentWatts=0 AveWatts=0
   ExtSensorsJoules=n/s ExtSensorsWatts=0 ExtSensorsTemp=n/s

NodeName=node10 Arch=x86_64 CoresPerSocket=14 
   CPUAlloc=0 CPUEfctv=28 CPUTot=28 CPULoad=1.00
   AvailableFeatures=(null)
   ActiveFeatures=(null)
   Gres=(null)
   NodeAddr=node10 NodeHostName=node10 Version=22.05.3
   OS=Linux 4.18.0-372.9.1.el8.x86_64 #1 SMP Tue May 10 14:48:47 UTC 2022 
   RealMemory=128544 AllocMem=0 FreeMem=2697 Sockets=2 Boards=1
   State=IDLE ThreadsPerCore=1 TmpDisk=0 Weight=1 Owner=N/A MCS_label=N/A
   Partitions=normal 
   BootTime=2022-09-22T18:38:09 SlurmdStartTime=2022-09-24T07:04:46
   LastBusyTime=2022-11-11T15:25:43
   CfgTRES=cpu=28,mem=128544M,billing=28
   AllocTRES=
   CapWatts=n/a
   CurrentWatts=0 AveWatts=0
   ExtSensorsJoules=n/s ExtSensorsWatts=0 ExtSensorsTemp=n/s

NodeName=node11 Arch=x86_64 CoresPerSocket=18 
   CPUAlloc=64 CPUEfctv=72 CPUTot=72 CPULoad=4.19
   AvailableFeatures=(null)
   ActiveFeatures=(null)
   Gres=(null)
   NodeAddr=node11 NodeHostName=node11 Version=22.05.3
   OS=Linux 4.18.0-372.9.1.el8.x86_64 #1 SMP Tue May 10 14:48:47 UTC 2022 
   RealMemory=515596 AllocMem=131072 FreeMem=33914 Sockets=4 Boards=1
   State=MIXED ThreadsPerCore=1 TmpDisk=0 Weight=1 Owner=N/A MCS_label=N/A
   Partitions=fat 
   BootTime=2022-09-22T15:35:14 SlurmdStartTime=2022-09-24T07:04:46
   LastBusyTime=2022-11-12T15:34:59
   CfgTRES=cpu=72,mem=515596M,billing=72
   AllocTRES=cpu=64,mem=128G
   CapWatts=n/a
   CurrentWatts=0 AveWatts=0
   ExtSensorsJoules=n/s ExtSensorsWatts=0 ExtSensorsTemp=n/s

NodeName=node12 Arch=x86_64 CoresPerSocket=18 
   CPUAlloc=64 CPUEfctv=72 CPUTot=72 CPULoad=3.06
   AvailableFeatures=(null)
   ActiveFeatures=(null)
   Gres=(null)
   NodeAddr=node12 NodeHostName=node12 Version=22.05.3
   OS=Linux 4.18.0-372.9.1.el8.x86_64 #1 SMP Tue May 10 14:48:47 UTC 2022 
   RealMemory=515596 AllocMem=131072 FreeMem=200686 Sockets=4 Boards=1
   State=MIXED ThreadsPerCore=1 TmpDisk=0 Weight=1 Owner=N/A MCS_label=N/A
   Partitions=fat 
   BootTime=2022-09-22T14:22:00 SlurmdStartTime=2022-09-24T07:04:46
   LastBusyTime=2022-11-11T20:13:41
   CfgTRES=cpu=72,mem=515596M,billing=72
   AllocTRES=cpu=64,mem=128G
   CapWatts=n/a
   CurrentWatts=0 AveWatts=0
   ExtSensorsJoules=n/s ExtSensorsWatts=0 ExtSensorsTemp=n/s

NodeName=node13 Arch=x86_64 CoresPerSocket=18 
   CPUAlloc=0 CPUEfctv=72 CPUTot=72 CPULoad=0.00
   AvailableFeatures=(null)
   ActiveFeatures=(null)
   Gres=(null)
   NodeAddr=node13 NodeHostName=node13 Version=22.05.3
   OS=Linux 4.18.0-372.9.1.el8.x86_64 #1 SMP Tue May 10 14:48:47 UTC 2022 
   RealMemory=515596 AllocMem=0 FreeMem=512479 Sockets=4 Boards=1
   State=DOWN ThreadsPerCore=1 TmpDisk=0 Weight=1 Owner=N/A MCS_label=N/A
   Partitions=fat 
   BootTime=2022-11-14T00:56:39 SlurmdStartTime=2022-11-14T00:56:09
   LastBusyTime=2022-11-14T00:55:09
   CfgTRES=cpu=72,mem=515596M,billing=72
   AllocTRES=
   CapWatts=n/a
   CurrentWatts=0 AveWatts=0
   ExtSensorsJoules=n/s ExtSensorsWatts=0 ExtSensorsTemp=n/s
   Reason=Node unexpectedly rebooted [slurm@2022-11-14T00:57:08]

NodeName=node14 Arch=x86_64 CoresPerSocket=8 
   CPUAlloc=64 CPUEfctv=64 CPUTot=64 CPULoad=19.24
   AvailableFeatures=(null)
   ActiveFeatures=(null)
   Gres=(null)
   NodeAddr=node14 NodeHostName=node14 Version=22.05.3
   OS=Linux 4.18.0-372.9.1.el8.x86_64 #1 SMP Tue May 10 14:48:47 UTC 2022 
   RealMemory=515694 AllocMem=131072 FreeMem=396358 Sockets=4 Boards=1
   State=ALLOCATED ThreadsPerCore=2 TmpDisk=0 Weight=1 Owner=N/A MCS_label=N/A
   Partitions=fat 
   BootTime=2022-09-22T12:58:01 SlurmdStartTime=2022-09-24T07:04:47
   LastBusyTime=2022-11-09T15:53:49
   CfgTRES=cpu=64,mem=515694M,billing=64
   AllocTRES=cpu=64,mem=128G
   CapWatts=n/a
   CurrentWatts=0 AveWatts=0
   ExtSensorsJoules=n/s ExtSensorsWatts=0 ExtSensorsTemp=n/s

NodeName=node15 Arch=x86_64 CoresPerSocket=8 
   CPUAlloc=64 CPUEfctv=64 CPUTot=64 CPULoad=10.81
   AvailableFeatures=(null)
   ActiveFeatures=(null)
   Gres=(null)
   NodeAddr=node15 NodeHostName=node15 Version=22.05.3
   OS=Linux 4.18.0-372.9.1.el8.x86_64 #1 SMP Tue May 10 14:48:47 UTC 2022 
   RealMemory=515694 AllocMem=131072 FreeMem=471990 Sockets=4 Boards=1
   State=ALLOCATED ThreadsPerCore=2 TmpDisk=0 Weight=1 Owner=N/A MCS_label=N/A
   Partitions=fat 
   BootTime=2022-09-22T12:58:47 SlurmdStartTime=2022-09-24T07:04:47
   LastBusyTime=2022-11-02T10:30:57
   CfgTRES=cpu=64,mem=515694M,billing=64
   AllocTRES=cpu=64,mem=128G
   CapWatts=n/a
   CurrentWatts=0 AveWatts=0
   ExtSensorsJoules=n/s ExtSensorsWatts=0 ExtSensorsTemp=n/s
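
The per-node records above give the socket, core, and thread counts separately. As a minimal Python sketch (the function names parse_node_record and summarize are hypothetical, and the two sample records are abbreviated copies of the node1 and node14 entries above), the code below derives the physical core count, the logical CPU count, and the memory size from one record. It assumes RealMemory is reported in megabytes, as the M suffix in CfgTRES suggests.

```python
# Abbreviated copies of two records from the listing above.
NODE1 = """\
NodeName=node1 Arch=x86_64 CoresPerSocket=14
   CPUAlloc=1 CPUEfctv=28 CPUTot=28 CPULoad=1.06
   OS=Linux 4.18.0-372.9.1.el8.x86_64 #1 SMP Tue May 10 14:48:47 UTC 2022
   RealMemory=128544 AllocMem=4200 FreeMem=27840 Sockets=2 Boards=1
   State=MIXED ThreadsPerCore=1 TmpDisk=0 Weight=1 Owner=N/A MCS_label=N/A
   Partitions=normal
"""

NODE14 = """\
NodeName=node14 Arch=x86_64 CoresPerSocket=8
   CPUAlloc=64 CPUEfctv=64 CPUTot=64 CPULoad=19.24
   OS=Linux 4.18.0-372.9.1.el8.x86_64 #1 SMP Tue May 10 14:48:47 UTC 2022
   RealMemory=515694 AllocMem=131072 FreeMem=396358 Sockets=4 Boards=1
   State=ALLOCATED ThreadsPerCore=2 TmpDisk=0 Weight=1 Owner=N/A MCS_label=N/A
   Partitions=fat
"""


def parse_node_record(record):
    """Turn one node record into a dict mapping field names to string values."""
    fields = {}
    for line in record.splitlines():
        stripped = line.strip()
        # OS= and Reason= values contain spaces, so keep those lines whole.
        if stripped.startswith(("OS=", "Reason=")):
            key, _, value = stripped.partition("=")
            fields[key] = value
            continue
        for token in stripped.split():
            key, sep, value = token.partition("=")
            if sep:
                fields[key] = value
    return fields


def summarize(fields):
    """Derive core counts and memory size from the parsed fields."""
    physical_cores = int(fields["Sockets"]) * int(fields["CoresPerSocket"])
    return {
        "node": fields["NodeName"],
        "partition": fields["Partitions"],
        "physical_cores": physical_cores,
        "logical_cpus": physical_cores * int(fields["ThreadsPerCore"]),
        # Assumption: RealMemory is in megabytes; divide by 1024 for a GiB-style figure.
        "memory_gib": round(int(fields["RealMemory"]) / 1024, 1),
    }


if __name__ == "__main__":
    for record in (NODE1, NODE14):
        print(summarize(parse_node_record(record)))
```

For node14 and node15 this makes the difference between cores and CPUs visible: 4 sockets with 8 cores each give 32 physical cores, and ThreadsPerCore=2 doubles that to the 64 CPUs reported in CPUTot.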