Molecule numbering for lattice site <> and molecule/entity <> is inconsistent.

9 months 1 week ago #3 by Galonso
{Zacros.3.01-Serial (intel-unix) compiled with Cmake}

Dear community:

I am getting in touch to report what I believe could be a bug. I am trying to run a simple simulation of CO hydrogenation on a square lattice with two types of sites: site "N0", which can hold all species, and site "HrN0", which acts as a hydrogen reservoir and is located at the center of each group of 4 "N0" sites. The simulation starts with some CO adsorptions (CO + * -> CO*) and H2 dissociative adsorptions (H2 + 2* -> H* + H*). While the coverage is still small, CO starts diffusing until it gets close to some H* located in "HrN0" sites. Then the following error is triggered:

Internal error code 800001 from zacros_main: infeasible process in serial run!
This is a serious issue. Please notify the developers about this...

When using debug flags, the message becomes:

Internal error code 802001 from lattice_handle_module: invalid parity of adsorbspecposi and latticestate arrays.
More information:
Molecule numbering for lattice site 123 and molecule/entity 391 is inconsistent.

By printing the surface state at every step, I have checked that the move that triggers the error is a CO diffusion (entity 391 appears to be the CO) from a neighbor-less site to the "N0" site 123, which has 2 H neighbors. To my knowledge, the move should be legal. I'd like to note that we found some successful steps with CO adjacent to H, so the error should not come from the CO--H neighboring interaction itself. So far we have only found the error when a CO diffuses close to 2 or 4 H atoms. I attach a PDF file with (probably too much) explanation and figures to illustrate the error, together with the inputs used and the outputs obtained.

Aside from this scenario, we made some additional tests to ensure simulations were properly carried out:

  1) Tested Zacros 3.01 on two different HPCs and a personal computer to ensure it's not a compilation/hardware issue.
           We get the same error.
  2) Made the lattice smaller to reduce possible input errors in lattice_input.dat.
           We get the same error.
  3) Built a hexagonal lattice to see if the error had something to do with the square connectivity.
           We get the same error.
  4) Set the H2 partial pressure to zero to see if CO diffusing alone (without H) provokes the error.
           We don't get the error.
  5) Displaced the lattice points by a small amount to ensure they do not touch the periodic boundary.
           We get the same error.
  6) Built the same square cell but alternating two kinds of equivalent sites, "N0" and "N1", connected exactly as in the previous simulation. We copied all energies and reactions from "N0" to "N1" to ensure both sites are indistinguishable apart from their names.
           We don't get the error anymore.

I'm not sure what may be causing the error, but since our test number 6 runs normally, I believe there could be a bug somewhere in how the lattice is built. I'm curious about what went wrong, so I hope this report helps in finding the potential bug so that we can all learn a bit more about Zacros.

Thank you very much for your time and assistance



9 months 1 week ago #4 by Galonso
The report was missing from the previous post and the simulation inputs/outputs were duplicated.
I'm uploading the report again. Sorry for the inconvenience.


9 months 3 days ago #5 by Admin
Sorry for your troubles and thanks for reporting this problem. I just got the chance to dig into it and my conclusion is that it is caused by a "silent" array overflow that corrupts the simulation data leading to failure. This can be demonstrated even in the serial version of the code, if you compile it with the following cmake command, which passes certain Fortran flags to enable array-bound checks, as well as tracing back to the line of the code where this happens:

cmake3 ../.. -DCMAKE_BUILD_TYPE=Debug -DCMAKE_Fortran_FLAGS='-g -traceback -O2 -check bounds' -Ddoopenmp=OFF -Ddompi=OFF

Then upon execution you get:

forrtl: severe (408): fort: (2): Subscript #1 of the array SMALLPATTERNSITES has value 21 which is greater than the upper bound of 20

Image              PC                Routine            Line        Source
zacros.x           000000000066810F  Unknown               Unknown  Unknown
zacros.x           000000000047E3B0  kmc_simulation_ha        1456  kmc_simulation_handle_module.F90
zacros.x           00000000004790F2  kmc_simulation_ha        1565  kmc_simulation_handle_module.F90
zacros.x           0000000000462013  kmc_simulation_ha        1270  kmc_simulation_handle_module.F90
zacros.x           00000000006265E5  state_propagator_         204  state_propagator_module.F90
zacros.x           0000000000407745  MAIN__                    783  zacros_main.F90
zacros.x           0000000000405962  Unknown               Unknown  Unknown
                   00007FCC39A52555  __libc_start_main     Unknown  Unknown
zacros.x           0000000000405869  Unknown               Unknown  Unknown

Thus, the problem is that a couple of fixed-size arrays, smallpatternsites and smallpatternstep, get filled up in your simulation (only the former is mentioned in the above error since it is the first one to overflow during execution). These arrays reside in subroutine add_monodentspecies_related_12siteproc in module kmc_simulation_handle_module and, as a workaround, you can increase their size from 20 to 80, as shown below (this should be in line 1386 of kmc_simulation_handle_module.F90 in Zacros 3.01):

    integer neighlist(20), nsmallpattern, smallpatternsites(80,2), smallpatternstep(80)

With the above fix, I was able to run your simulation up to the 10000 steps prescribed in your simulation input file.

We will implement memory amortization (dynamic arrays) for the arrays mentioned above to avoid such issues, since there is no way to know a priori how big they should be. Yet, this feature might not be ready in the next release of Zacros, so, in the meantime please try the above workaround and let us know if this fixes your issue.
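
For readers unfamiliar with the term, the amortized-growth idea can be sketched as follows. This is not Zacros code and not the planned implementation, only a minimal Python illustration. The class name GrowablePairBuffer, the doubling factor, and the default capacity of 20 (chosen to mirror the hard-coded bound discussed above) are all assumptions made for the example.

```python
class GrowablePairBuffer:
    """Hypothetical stand-in for a fixed-size (capacity, 2) integer array."""

    def __init__(self, capacity=20):
        if capacity < 1:
            raise ValueError("capacity must be positive")
        self.capacity = capacity
        self.count = 0
        # Preallocated storage, analogous to a fixed-size array of site pairs.
        self._rows = [(0, 0)] * capacity

    def append(self, site_a, site_b):
        if self.count == self.capacity:
            # Buffer is full: double the storage instead of writing past the end.
            self._rows.extend([(0, 0)] * self.capacity)
            self.capacity *= 2
        self._rows[self.count] = (site_a, site_b)
        self.count += 1

    def rows(self):
        # Only the filled part of the buffer is meaningful.
        return self._rows[:self.count]


buf = GrowablePairBuffer(capacity=20)
for i in range(21):  # the 21st entry is the one that overflowed in the report
    buf.append(i, i + 1)
print(buf.count, buf.capacity)  # prints "21 40"
```

Because the capacity doubles each time it is exhausted, the total copying work stays proportional to the number of entries stored, so no upper bound has to be guessed in advance.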
