[Test] 3950X 2080Ti 9980XE: 150k-class lab compute machine

Board PC_Shopping
Author fo40225 ()
Time 2020-05-08 18:58:59
Comments 7 comments (5 push, 0 boo, 2 →)

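Last year's budget had been set aside to buy a 3950x in September, but AMD slipped the launch and the 3900x was out of stock as well, so I ended up buying one 9980xe at NT$60k plus a few 9900k.
(The 3950x only became available in November; even then, the procurement process could not have finished before the year-end account closing.)
Now let's test whether this year's money should go to the NT$25k 3950x or the NT$35k 10980XE.
Intel is unlikely to squeeze out anything new in the short term, so this post should stay relevant for a while.

For details of the test software, see #1UjJiMol (PC_Shopping)

===

Test hardware

AMD Ryzen 9 3950X
Thermalright Silver Arrow IB-E Extreme
ASUS Pro WS X570-ACE
4x Kingston KVR32N22D8/32
2x GIGABYTE RTX2080Ti TURBO 11G (rev. 2.0)
MSI NVLink GPU Bridge 3-Slots
XPG SX8200Pro 1TB
FSP (全漢) CANNON 2000W
FSP (全漢) CMT230 炫戰士
(The two front case fans were moved up. Originally both blew at the drive/PSU shroud and the graphics cards; now the lower one blows at the graphics cards and the upper one is aimed at the M.2.)
(Only on delivery did I notice that the Gigabyte cards are rev. 2.0. Rev. 1.0 and 2.0 differ in where the power connectors sit. On 1.0 they are on the side, which is easier to install in a desktop, although the power cables may get in the way if the case is not wide enough. On 2.0 they are at the rear, which should suit rack-mount chassis better, but is also hard to fit if the case is not deep enough.)

Intel Core i9-9980XE
Thermalright Silver Arrow IB-E Extreme
ASUS WS X299 PRO
8x A-DATA AD4U2666732G19-RGN
2x ASUS TURBO-RTX2080TI-11G
Quadro RTX 6000/8000 NVLink HB Bridge 2-Slot
ASUS HYPER M.2 X4 MINI CARD
└XPG SX8200Pro 1TB
FSP (全漢) CANNON 2000W
MSI MPG GUNGNIR 100
(This case has little room for cable routing behind the motherboard tray, and the front fans do not move much air.)

BIOS versions and settings

ASUS Pro WS X570-ACE 1302
PBO manual
PPT 1000W
TDC 1000A
EDC 1000A
Everything else at defaults
DDR4-3200 (22-22-22) 1.2V
(I suspect PBO is broken in this BIOS version, so treat the results as reference only.
Defaults, p95 all-core: sse2 about 3.8GHz, avx2 about 3.4GHz
PBO Enable: p95 avx2 goes to a black screen instantly
1000/1000/1000: sse2 about 3.8GHz, avx2 about 3.8GHz
Manually tuned to 200~300A: sse2 about 4.0GHz, avx2 about 3.9GHz
Setting Max CPU Boost Clock Override to 200MHz leaves a bunch of cores locked at 500MHz)

ASUS WS X299 PRO 2002
Long Duration Package Power Limit 4095W
Package Power Time Window 127s
Short Duration Package Power Limit 4095W
CPU Integrated VR Current Limit 1023.875A
Front upper fan (1): sensor source VRM
Front lower fans (2, 3): sensor source PCH
Rear fan: sensor source PCH
Fan curve: 20°C 20%, 65°C 70%, 70°C 100%
Everything else at defaults
DDR4-2666 (19-19-19) 1.2V

In addition, the 2080ti power limit was raised to 280W with
nvidia-smi -pm 1
nvidia-smi -pl 280

OS
Ubuntu Server 20.04 LTS
kernel 5.4.0-26
CUDA driver 440.64

Frequency, temperature, and power

3950x: temperature read with sensors; frequency and watts read with turbostat
9980xe: temperature, frequency, and watts read with turbostat
2080ti: temperature, frequency, and watts read with nvidia-smi
"Power strip" is the reading at the metered extension cord, i.e. the whole system.

Idle
3950x+2x2080ti   CPU 2200MHz 32°C 20W   GPU 300MHz 32°C 13W   Power strip 95W
9980xe+2x2080ti  CPU 1200MHz 34°C 12W   GPU 300MHz 35°C 10W   Power strip 95W

Prime95 Version 29.8 build 6
Small FFTs(L1/L2/L3)

3950x sse2
1 s    CPU 3826MHz 54.5°C 131W   Power strip 227W
1 min  CPU 3768MHz 62.5°C 125W   Power strip 218W
https://youtu.be/kDgSxc9guZc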

3950x fma3
1 s    CPU 3775MHz 60.3°C 156W   Power strip 263W
1 min  CPU 3753MHz 72.5°C 161W   Power strip 271W
https://youtu.be/fZ3C3hk8TCk

9980xe sse2
1 s    CPU 3800MHz 66°C 257W   Power strip 418W
1 min  CPU 3800MHz 87°C 265W   Power strip 430W
https://youtu.be/WZj_AQrFpME

9980xe fma3
1 s    CPU 3300MHz 61°C 241W   Power strip 388W
1 min  CPU 3300MHz 80°C 243W   Power strip 395W
https://youtu.be/i1VyFFrVi0U

9980xe avx512
1 s    CPU 2800MHz 59°C 210W   Power strip 344W
1 min  CPU 2800MHz 74°C 208W   Power strip 343W
https://youtu.be/vKs5G91rL7c

1xGPU tensorflow
resnet50 training fp16 batch128

1x2080ti on 3950x
1 s    GPU 1830MHz 48°C 283W   Power strip 416W
1 min  GPU 1815MHz 68°C 277W   Power strip 369W
https://youtu.be/T2XE2HlIeLg

1x2080ti on 9980xe
1 s    GPU 1875MHz 52°C 274W   Power strip 428W
1 min  GPU 1815MHz 78°C 262W   Power strip 388W
https://youtu.be/pGnW6Am8jaA
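
The GPU clock, temperature, and power figures in this post were read from nvidia-smi. The exact invocation is not stated, so the following is only a minimal sketch of how such readings could be logged, assuming the CSV query mode `nvidia-smi --query-gpu=index,clocks.sm,temperature.gpu,power.draw --format=csv,noheader,nounits`. The helper names `parse_gpu_line` and `read_gpus` are hypothetical.

```python
import subprocess

# Fields requested from nvidia-smi; the order must match parse_gpu_line below.
QUERY = "index,clocks.sm,temperature.gpu,power.draw"


def parse_gpu_line(line):
    """Parse one CSV line such as '0, 1815, 68, 277.31' into a dict.

    Assumes every field is numeric; a field reported as '[N/A]' would raise.
    """
    idx, mhz, temp_c, watts = [field.strip() for field in line.split(",")]
    return {
        "gpu": int(idx),
        "mhz": int(mhz),
        "temp_c": int(temp_c),
        "watts": float(watts),
    }


def read_gpus():
    """Run nvidia-smi once and return one dict per GPU (needs an NVIDIA driver)."""
    out = subprocess.run(
        ["nvidia-smi", f"--query-gpu={QUERY}", "--format=csv,noheader,nounits"],
        capture_output=True,
        text=True,
        check=True,
    ).stdout
    return [parse_gpu_line(line) for line in out.splitlines() if line.strip()]
```

Calling `read_gpus()` once per second and printing the result would give a log comparable to the 1 s / 1 min readings above; `parse_gpu_line` can be tried on a saved line without a GPU present.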

p95+2GPU tensorflow

3950x avx2 + 2x2080ti    Power strip 796W
https://youtu.be/MzaYkBRSAX0

9980xe sse2 + 2x2080ti   Power strip 946W
https://youtu.be/EAauv9QAHkQ
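
One way to read the Prime95 figures above together: subtracting the 95W idle reading from a load reading gives a rough power-strip-side cost of the load, which can be set against the package power that turbostat reports. The sketch below uses the 1-minute Prime95 readings from this post. The helper names `load_delta` and `summarize` are hypothetical, and the difference also contains PSU and VRM losses, fans, and RAM, so it is a rough cross-check only.

```python
IDLE_STRIP_W = 95  # both systems idled at 95W at the power strip

# (label, package watts from turbostat, watts at the power strip), 1-minute readings
P95_1MIN = [
    ("3950x sse2", 125, 218),
    ("3950x fma3", 161, 271),
    ("9980xe sse2", 265, 430),
    ("9980xe fma3", 243, 395),
    ("9980xe avx512", 208, 343),
]


def load_delta(strip_w, idle_w=IDLE_STRIP_W):
    """Watts added at the power strip relative to idle."""
    return strip_w - idle_w


def summarize(rows):
    """Map label -> (strip delta in W, strip delta divided by package watts)."""
    result = {}
    for label, pkg_w, strip_w in rows:
        delta = load_delta(strip_w)
        result[label] = (delta, round(delta / pkg_w, 2))
    return result


for label, (delta, ratio) in summarize(P95_1MIN).items():
    print(f"{label:14s} strip delta {delta:4d}W  delta/package {ratio:.2f}")
```

The same subtraction applied to the combined runs (796W and 946W) shows how much of the 2000W supply each build actually uses under a CPU plus dual-GPU load.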

CPU theoretical performance test

./2006-Core2          // uses SSE2; models general / ordinary / traditional / ancient-relic applications
./2013-Haswell        // uses AVX/FMA3; models highly optimized modern applications
./2017-SkylakePurley  // uses AVX512; Intel's bonus question

      | 128-bit SSE2      | 256-bit AVX       | 256-bit FMA3
      | Multiply + Add    | Multiply + Add    | Fused Multiply Add
      | 1T     | nT       | 1T     | nT       | 1T      | nT
3950x | 44.928 | 995.664  | 78.912 | 1552.99  | 123.072 | 1791.36
9980xe| 35.04  | 546.144  | 62.016 | 948.384  | 123.264 | 1882.37

      | 512-bit AVX512
      | Fused Multiply Add
      | 1T      | nT
9980xe| 235.008 | 3227.14

CPU compute performance test

           |Cholesky|Det    |Dot    |Fft |Inv   |Lu    |Qr    |Svd
3950x pip  | 511.02 | 639.38| 648.55|5.17|433.32|575.97|122.69| 7.22
3950x mkl  | 585.48 | 624.04| 247.31|5.29|285.64|536.54|333.59|11.15
debug mkl  | 561.61 | 519.77| 626.46|6.40|479.98|454.04|376.93|12.73
9980xe pip | 597.74 | 699.82| 766.01|3.91|483.11|573.80|160.14|11.59
9980xe mkl | 820.29 |1086.11|1355.97|3.74|712.80|749.14|366.21|14.17

IO test

          |3950x          |9980xe
1MSeqQ8T1r|2784MB/s       |2387MB/s
1MSeqQ8T1w|2867MB/s       |2324MB/s
1MSeqQ1T1r|2779MB/s       |2405MB/s
1MSeqQ1T1w|2834MB/s       |2283MB/s
4kQ32T16r | 697MB/s(170k) | 655MB/s(160k)
4kQ32T16w |1498MB/s(366k) |1492MB/s(364k)
4kQ1T1r   |79.7MB/s(19.5k)|65.9MB/s(16.1k)
4kQ1T1w   | 234MB/s(57.1k)| 230MB/s(56.2k)
(Both SSDs are new and both are attached to the CPU, so the gap is probably the impact of the Intel vulnerabilities.)

nvidia-smi topo -m

3950x
        GPU0    GPU1    CPU Affinity
GPU0     X      NV2     0-31
GPU1    NV2      X      0-31

9980xe
        GPU0    GPU1    CPU Affinity
GPU0     X      NV2     0-35
GPU1    NV2      X      0-35

Legend:
  X    = Self
  SYS  = Connection traversing PCIe as well as the SMP interconnect between NUMA nodes (e.g., QPI/UPI)
  NODE = Connection traversing PCIe as well as the interconnect between PCIe Host Bridges within a NUMA node
  PHB  = Connection traversing PCIe as well as a PCIe Host Bridge (typically the CPU)
  PXB  = Connection traversing multiple PCIe bridges (without traversing the PCIe Host Bridge)
  PIX  = Connection traversing at most a single PCIe bridge
  NV#  = Connection traversing a bonded set of # NVLinks

nvidia-smi topo -mp

3950x
        GPU0    GPU1    CPU Affinity
GPU0     X      PHB     0-31
GPU1    PHB      X      0-31

9980xe
        GPU0    GPU1    CPU Affinity
GPU0     X      SYS     0-35
GPU1    SYS      X      0-35
Legend: same as for topo -m above, without the NV# entry.

p2pBandwidthLatencyTest

3950x

Unidirectional P2P=Disabled Bandwidth Matrix (GB/s)
   D\D      0       1
     0  529.77    6.24
     1    6.25  531.67
Unidirectional P2P=Enabled Bandwidth (P2P Writes) Matrix (GB/s)
   D\D      0       1
     0  530.74   46.92
     1   46.93  531.33
Bidirectional P2P=Disabled Bandwidth Matrix (GB/s)
   D\D      0       1
     0  533.64   11.11
     1   11.10  535.07
Bidirectional P2P=Enabled Bandwidth Matrix (GB/s)
   D\D      0       1
     0  533.64   93.47
     1   93.68  532.94
P2P=Disabled Latency Matrix (us)
   GPU      0       1
     0    1.90   15.96
     1   12.55    1.93
   CPU      0       1
     0    2.82    7.58
     1    7.61    3.00
P2P=Enabled Latency (P2P Writes) Matrix (us)
   GPU      0       1
     0    1.90    2.04
     1    2.06    1.94
   CPU      0       1
     0    3.07    2.50
     1    2.51    3.06

9980xe

Unidirectional P2P=Disabled Bandwidth Matrix (GB/s)
   D\D      0       1
     0  528.38   11.23
     1   11.24  531.12
Unidirectional P2P=Enabled Bandwidth (P2P Writes) Matrix (GB/s)
   D\D      0       1
     0  530.90   46.94
     1   46.97  531.39
Bidirectional P2P=Disabled Bandwidth Matrix (GB/s)
   D\D      0       1
     0  535.18   20.01
     1   20.07  534.61
Bidirectional P2P=Enabled Bandwidth Matrix (GB/s)
   D\D      0       1
     0  533.96   93.68
     1   93.53  532.69
P2P=Disabled Latency Matrix (us)
   GPU      0       1
     0    1.88   15.22
     1   13.27    1.83
   CPU      0       1
     0    2.59    6.90
     1    6.93    2.51
P2P=Enabled Latency (P2P Writes) Matrix (us)
   GPU      0       1
     0    1.88    1.77
     1    1.75    1.84
   CPU      0       1
     0    2.73    1.93
     1    1.92    2.56

Tensorflow test

resnet50

1x2080Ti
       |fp32batch64|fp32batch128|fp16batch64|fp16batch128|fp16batch256
3950x  | 266.86    | 240.54     | 669.76    | 683.40     | 566.51
9980xe | 269.42    | 264.58     | 672.30    | 685.81     | 640.76

2x2080ti
fp32   | batch32   | batch64   | batch128
       | global64  | global128 | global256
3950x  | 540.22    | 592.19    | 387.67
9980xe | 541.25    | 597.76    | 486.02

2x2080ti
fp16   | batch32   | batch64   | batch128  | batch256
       | global64  | global128 | global256 | global512
3950x  | 1103.82   | 1333.90   | 1479.71   | 1180.18
9980xe | 1078.67   | 1288.72   | 1400.09   | 1333.67

Pytorch and AMP (Apex) test

bert            | fp32   | fp16   |
3950x 2x2080ti  |00:26.38|00:26.22|
9980xe 2x2080ti |00:29.67|00:34.92|

===

At this price point (100k~200k), if the budget allows and you need multi-core CPU math performance or a large amount of RAM, buy the 10980xe.
Quad-channel memory, double the throughput from avx512, and MKL optimization are no joke.
RAM capacity is doubled (256GB vs 128GB).
With the ASUS WS X299 PRO/SE motherboard you also get onboard display output plus IPMI.

If the budget is tight, the 3900x is probably the more sensible buy.

For a dual-GPU machine used purely for DL,
a 3600x on an x8/x8 board with two second-hand 1080ti cards is probably the best value for money.

--
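
As a footnote to the remark that MKL optimization is no joke: dividing the mkl row by the pip row of the CPU compute table gives a per-operation ratio for each CPU. This is a minimal sketch that assumes higher scores are better in that table (the post does not state the units); `ratio_row` is a hypothetical helper and the numbers are copied from the table.

```python
OPS = ["Cholesky", "Det", "Dot", "Fft", "Inv", "Lu", "Qr", "Svd"]

# Rows copied from the CPU compute performance table in this post.
SCORES = {
    "3950x pip": [511.02, 639.38, 648.55, 5.17, 433.32, 575.97, 122.69, 7.22],
    "3950x mkl": [585.48, 624.04, 247.31, 5.29, 285.64, 536.54, 333.59, 11.15],
    "9980xe pip": [597.74, 699.82, 766.01, 3.91, 483.11, 573.80, 160.14, 11.59],
    "9980xe mkl": [820.29, 1086.11, 1355.97, 3.74, 712.80, 749.14, 366.21, 14.17],
}


def ratio_row(numerator, denominator):
    """Per-operation ratio of two table rows, rounded to two decimals."""
    return {
        op: round(n / d, 2)
        for op, n, d in zip(OPS, SCORES[numerator], SCORES[denominator])
    }


for cpu in ("3950x", "9980xe"):
    ratios = ratio_row(f"{cpu} mkl", f"{cpu} pip")
    print(cpu, " ".join(f"{op}={r:.2f}" for op, r in ratios.items()))
```

Under that assumption, a ratio above 1 means the mkl build scored higher than the pip build for that operation on that CPU, and a ratio below 1 means it scored lower.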
※ Posted via: PTT (批踢踢實業坊, ptt.cc), from: 140.112.16.145 (Taiwan)
※ Article URL: https://www.ptt.cc/bbs/PC_Shopping/M.1588935543.A.AB4.html

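a58524andy : So many posts XD, push for the testing 05/08 19:10

windrain0317: Another push for this one 05/08 19:52

tony70017 : Just checked: TR-3990X, 64 cores / 128 threads, NT$123,456 at 原價屋 05/08 20:15

tony70017 : Sorry, wrong channel 05/08 20:15

tpegioe : 05/08 20:21

sdbb : A refreshingly clean post 05/08 20:46

dreamgirl : I'd pick two 3900x machines; costs about the same as the Intel build 05/08 21:24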
