checkpoint_techniques_on_compute_canada_clusters

Differences

This shows you the differences between two versions of the page.

checkpoint_techniques_on_compute_canada_clusters [2015/04/17 15:29] – [Checkpointing techniques on guillimin] 132.216.122.26
checkpoint_techniques_on_compute_canada_clusters [2024/03/26 13:52] (current) – external edit 127.0.0.1
Line 79:

 # New version of this script. Now we use DMTCP to launch
-# the scripts (and gnu-parallel).
+# the scripts.

 def chunks(l, n):
Line 169:

 **Currently this is not working as expected; for some unknown reason, only 2 random jobs get re-started. I have contacted Calcul Québec about this and they should reply shortly. I will update this page with a bug-free script (or whatever solution they give me.)**
+
+**Update 2: they did not reply.**
  • Last modified: 2024/03/26 13:52