- Posts: 2
- Thank you received: 0
Molecule numbering for lattice site <> and molecule/entity <> is inconsistent.
- Galonso
- Topic Author
- Offline
- User
1 year 4 months ago #3
by Galonso
Molecule numbering for lattice site <> and molecule/entity <> is inconsistent. was created by Galonso
{Zacros.3.01-Serial (intel-unix) compiled with Cmake}
Dear community:
I am getting in touch to report what I believe could be a bug. I am trying to run a simple simulation of CO hydrogenation on a square lattice with two types of sites: "N0" sites, which can hold all species, and "HrN0" sites, which act as a hydrogen reservoir and are located at the centers of groups of four "N0" sites. The simulation starts with some CO adsorptions (CO + * -> CO*) and H2 dissociative adsorptions (H2 + 2* -> H* + H*). While the coverage is still small, CO starts diffusing until it gets close to some H* located on "HrN0" sites. Then the following error is triggered:
Internal error code 800001 from zacros_main: infeasible process in serial run!
This is a serious issue. Please notify the developers about this...
With debug flags enabled, the message becomes:
Internal error code 802001 from lattice_handle_module: invalid parity of adsorbspecposi and latticestate arrays.
More information:
Molecule numbering for lattice site 123 and molecule/entity 391 is inconsistent.
By printing the surface state at every step, I have checked that the move that triggers the error is a CO diffusion (entity 391 appears to be the CO) from a site with no neighbors to "N0" site 123, which has 2 H neighbors. To my knowledge, the move should be legal. Note that we observed some successful steps with CO adjacent to H, so the error should not come from the CO--H neighboring interaction. So far we have only found the error when a CO diffuses close to 2 or 4 H atoms. I attach a PDF file with (probably too much) explanation and figures to illustrate the error, the inputs used, and the outputs obtained.
Aside from this scenario, we ran some additional tests to make sure the simulations were set up properly:
1) Tested Zacros-3.01 on two different HPCs and a personal computer to ensure it is not a compilation/hardware issue
We get the same error
2) Tried to make the lattice smaller to reduce possible input errors in lattice_input.dat
We get the same error
3) Tried to make a hexagonal lattice to see if the error had something to do with the square connectivity
We get the same error
4) Tried to set the H2 partial pressure to zero to see if CO diffusing alone (without H) triggers the error
We don't get the error
5) Tried to displace the lattice points by a small amount to ensure they do not touch the periodic boundary
We get the same error
6) Tried to build the same square cell but alternating two kinds of equivalent sites, "N0" and "N1", connected exactly as in the previous simulation. We copied all energies and reactions from "N0" to "N1" to ensure both sites are indistinguishable except for their names.
We don't get the error anymore.
I'm not sure what may be causing the error, but since test number 6 runs normally, I believe there could be a bug somewhere in how the lattice is built. I'm curious about what went wrong, so I hope this report helps in finding the potential bug and that we can all learn a bit more about Zacros.
Thank you very much for your time and assistance
- Galonso
- Topic Author
- Offline
- User
- Posts: 2
- Thank you received: 0
1 year 4 months ago #4
by Galonso
Replied by Galonso on topic Molecule numbering for lattice site <> and molecule/entity <> is inconsistent.
The report was missing in the previous post and the simulation inputs/outputs were duplicated.
I'm trying to upload the report again. Sorry for the inconvenience.
- Admin
- Offline
- Admin
- Posts: 9
- Thank you received: 2
1 year 3 months ago #5
by Admin
Replied by Admin on topic Molecule numbering for lattice site <> and molecule/entity <> is inconsistent.
Sorry for your troubles and thanks for reporting this problem. I just got the chance to dig into it and my conclusion is that it is caused by a "silent" array overflow that corrupts the simulation data leading to failure. This can be demonstrated even in the serial version of the code, if you compile it with the following cmake command, which passes certain Fortran flags to enable array-bound checks, as well as tracing back to the line of the code where this happens:
Code:
cmake3 ../.. -DCMAKE_BUILD_TYPE=Debug -DCMAKE_Fortran_FLAGS='-g -traceback -O2 -check bounds' -Ddoopenmp=OFF -Ddompi=OFF
Then upon execution you get:
Code:
forrtl: severe (408): fort: (2): Subscript #1 of the array SMALLPATTERNSITES has value 21 which is greater than the upper bound of 20
Image PC Routine Line Source
zacros.x 000000000066810F Unknown Unknown Unknown
zacros.x 000000000047E3B0 kmc_simulation_ha 1456 kmc_simulation_handle_module.F90
zacros.x 00000000004790F2 kmc_simulation_ha 1565 kmc_simulation_handle_module.F90
zacros.x 0000000000462013 kmc_simulation_ha 1270 kmc_simulation_handle_module.F90
zacros.x 00000000006265E5 state_propagator_ 204 state_propagator_module.F90
zacros.x 0000000000407745 MAIN__ 783 zacros_main.F90
zacros.x 0000000000405962 Unknown Unknown Unknown
libc-2.17.so 00007FCC39A52555 __libc_start_main Unknown Unknown
zacros.x 0000000000405869 Unknown Unknown Unknown
Thus, the problem is that a couple of fixed-size arrays, smallpatternsites and smallpatternstep, fill up in your simulation (only the former is mentioned in the above error since it is the first to overflow during execution). These arrays reside in subroutine add_monodentspecies_related_12siteproc in module kmc_simulation_handle_module, and, as a workaround, you can increase their size from 20 to 80, as shown below (this should be in line 1386 of kmc_simulation_handle_module.F90 in Zacros 3.01):
Code:
integer neighlist(20), nsmallpattern, smallpatternsites(80,2), smallpatternstep(80)
With the above fix, I was able to run your simulation up to the 10000 steps prescribed in your simulation input file.
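For readers who want a larger fixed size to fail loudly rather than silently if it is ever exceeded again, an explicit capacity check before each write is one option. The following is only a minimal standalone sketch, not Zacros code: the program, the push_pattern helper, and the maxsmallpatterns parameter are hypothetical names, and only the array name and its bound of 20 are borrowed from the discussion above.

```fortran
program guard_fixed_array
    implicit none
    ! Hypothetical stand-in for the fixed-size work array named in the post.
    integer, parameter :: maxsmallpatterns = 20
    integer :: smallpatternsites(maxsmallpatterns,2)
    integer :: nsmallpattern, i
    logical :: ok

    nsmallpattern = 0
    smallpatternsites = 0

    ! Try to store more entries than the array can hold.
    do i = 1, 25
        call push_pattern(i, i + 100, ok)
        if (.not. ok) then
            print '(a,i0,a,i0)', 'capacity ', maxsmallpatterns, &
                ' exhausted at entry ', i
            exit
        end if
    end do

contains

    ! Store one pair of sites, refusing the write if the array is full.
    subroutine push_pattern(site1, site2, ok)
        integer, intent(in) :: site1, site2
        logical, intent(out) :: ok

        ok = nsmallpattern < size(smallpatternsites, 1)
        if (.not. ok) return   ! report failure instead of writing out of bounds
        nsmallpattern = nsmallpattern + 1
        smallpatternsites(nsmallpattern,1) = site1
        smallpatternsites(nsmallpattern,2) = site2
    end subroutine push_pattern

end program guard_fixed_array
```

In this sketch the 21st write is refused and reported, which mirrors the subscript value of 21 in the traceback above, but without relying on compiler bound checks being switched on.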
We will implement memory amortization (dynamic arrays) for the arrays mentioned above to avoid such issues, since there is no way to know a priori how big they should be. Yet, this feature might not be ready in the next release of Zacros, so, in the meantime please try the above workaround and let us know if this fixes your issue.
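As an illustration of what amortized growth of such an array could look like, here is a minimal standalone sketch using an allocatable array that doubles its capacity with move_alloc whenever it fills up. It is an assumption about the general technique only and not the planned Zacros implementation: the program, the append_pair helper, and the sites array are hypothetical names.

```fortran
program growable_pattern_list
    implicit none
    ! Hypothetical dynamic replacement for a fixed-size (20,2) work array.
    integer, allocatable :: sites(:,:)
    integer :: n, i

    allocate(sites(20,2))
    n = 0

    ! Append well past the initial capacity; the array grows as needed.
    do i = 1, 100
        call append_pair(sites, n, i, 2*i)
    end do

    ! Sanity checks: all entries survived the reallocations.
    if (n /= 100) error stop 'wrong entry count'
    if (sites(21,2) /= 42 .or. sites(100,1) /= 100) error stop 'data lost'

    print '(a,i0,a,i0)', 'stored ', n, ' entries, capacity ', size(sites, 1)

contains

    ! Append the pair (a, b), doubling the first dimension when full.
    subroutine append_pair(arr, n, a, b)
        integer, allocatable, intent(inout) :: arr(:,:)
        integer, intent(inout) :: n
        integer, intent(in) :: a, b
        integer, allocatable :: tmp(:,:)

        if (n == size(arr, 1)) then
            allocate(tmp(2*size(arr, 1), size(arr, 2)))
            tmp(1:n,:) = arr(1:n,:)      ! copy the existing entries
            call move_alloc(tmp, arr)    ! arr now owns the larger buffer
        end if
        n = n + 1
        arr(n,1) = a
        arr(n,2) = b
    end subroutine append_pair

end program growable_pattern_list
```

Doubling keeps the number of reallocations logarithmic in the final size, so the copy cost averages out to a constant per append, and no upper bound has to be guessed in advance.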