Move pragma ivdep above omp directive and implement
const correct copy constructors for KokkosArray3
-For Dan S.
22 lines of code changed in 2 files:
Update Array3 operator(i,j,k) to call through to Kokkos if enabled
56 lines of code changed in 4 files:
Temp disable Kokkos in RMCRT/Ray.cc because the implementation has changed
12 lines of code changed in 1 file:
Added the SVN URL to sus's output so we can identify which branch was used during a particular run.
7 lines of code changed in 4 files:
add vectorization hints to parallel_for
6 lines of code changed in 1 file:
Enabled restart tests on the threaded scheduler and GPU tests.
8 lines of code changed in 1 file:
Update one of the GPU compressible tests to output EVERY timestep since GPU checkpoints contain garbage if one doesn't output every timestep.
1 lines of code changed in 1 file:
Add RT coverage for the new implicit solver
16 lines of code changed in 1 file:
Allow initial timestep of restarts to run multi-threaded. This enables GPU-RMCRT restarts for ARCHES.
2 lines of code changed in 1 file:
7 lines of code changed in 1 file:
Added notes on setting up a RT on a machine
78 lines of code changed in 1 file:
add GPU RT coverage for the compressible flow algorithm
15 lines of code changed in 1 file:
set the CUDA_VISIBLE_DEVICES environment variable. Soon this will need
to be a machine-dependent variable.
2 lines of code changed in 1 file:
This makes RMCRT computations fully asynchronous, giving us substantial speedups in computations with multiple patches. Future work can look at reducing the number of GPU registers per GPU thread, as it appears the GPU is getting filled up faster and can't hold all concurrent kernels.
328 lines of code changed in 9 files:
Move the MPIRUN path to the front of the line. You need this if the machine is using a non-system mpirun.
4 lines of code changed in 1 file:
Add DOUT - Based on printf, this is a simple, threadsafe alternative to DebugStream. Using this initially to cleanup GPU DataWarehouse debug output.
133 lines of code changed in 1 file:
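As a rough illustration of the idea in the DOUT entry above (a printf-style debug print that is safe to call from several threads), here is a minimal sketch. It is not the actual DOUT code: `debug_print` and `g_print_mutex` are hypothetical names, and the design (format into a local buffer, then write under a single mutex) is an assumption about how such a helper could work.

```cpp
#include <cassert>
#include <cstdarg>
#include <cstdio>
#include <mutex>
#include <string>

// Hypothetical helper, NOT Uintah's DOUT. One global mutex serializes writes.
static std::mutex g_print_mutex;

// Formats like printf, writes one line to stderr, and returns the message
// so that callers (or tests) can inspect what was written.
std::string debug_print(bool enabled, const char* fmt, ...)
{
  assert(fmt != nullptr);
  if (!enabled) {
    return std::string();  // disabled: no formatting, no output
  }

  char buffer[1024];
  va_list args;
  va_start(args, fmt);
  std::vsnprintf(buffer, sizeof(buffer), fmt, args);  // truncates long messages
  va_end(args);

  // Holding the lock for the whole write keeps lines from different threads
  // from interleaving.
  std::lock_guard<std::mutex> lock(g_print_mutex);
  std::fputs(buffer, stderr);
  std::fputc('\n', stderr);
  return std::string(buffer);
}
```

Formatting happens outside the lock, so threads only contend for the short write itself.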
Streaming kernels in hopes of overlapping computations
2 lines of code changed in 1 file:
Some important fixes
33 lines of code changed in 1 file:
Some cleanups, a few more are coming
25 lines of code changed in 1 file:
This should fix the build errors
85 lines of code changed in 1 file:
This patch mainly targeted getting all RMCRT runs working on the new GPU framework. It *might* break Wasatch in very small and easily fixable ways. This also fixes perhaps a dozen or so subtle but important bugs in the scheduler.
576 lines of code changed in 11 files:
Removed the unnecessary creation of several IntVectors.
11 lines of code changed in 1 file:
Reduced the creation and deletion of hypre objects. This should give a small speed up of the radiation solve at scale. To maintain support for the petsc solver, minor changes to petsc were also made.
181 lines of code changed in 10 files:
Clean up some doxygen warnings.
86 lines of code changed in 18 files:
updated configure line for cyrus.mech.utah.edu
2 lines of code changed in 1 file:
Older versions of python crash on the previous commit. Trying a different approach to determine if an environment variable is set.
2 lines of code changed in 1 file:
Newer versions of openmpi won't take -x MALLOC_STATS if the environment variable is unset.
Added a workaround.
9 lines of code changed in 1 file:
First cut at an implicit solver in Wasatch using Pseudo Transient Continuation (Psi_TC).
The current solver essentially performs a fixed point BDF1 in physical time with a forward Euler on the pseudo space.
The solver supports the majority of current Wasatch capabilities (ODEs, Scalar Transport, and Compressible flows) along with boundary conditions.
The Wasatch Psi_TC implementation uses a subscheduler to manage the iterative solution.
To trigger Psi_TC, simply add the following XML block to any Wasatch input file:
<DualTime iterations="1000" tolerance="1e-7" ds="0.01"/>
That's all!
Tests and examples will be added soon.
535 lines of code changed in 10 files:
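The Psi_TC entry above describes a BDF1 step in physical time that is converged by forward Euler steps in pseudo time. A minimal scalar sketch of that idea follows. It is not the Wasatch implementation: `dual_time_step` and `decay_rhs` are hypothetical names, and the parameters `iterations`, `tolerance`, and `ds` are only assumed to play the same roles as the XML attributes of the same names.

```cpp
#include <cassert>
#include <cmath>
#include <functional>

// Advance du/dt = rhs(u) by one physical step dt using BDF1 (backward Euler):
//     (u_new - u_old) / dt = rhs(u_new)
// The implicit equation is solved by marching in pseudo time s:
//     du/ds = -(u - u_old) / dt + rhs(u)
// with forward Euler steps of size ds, until the residual is small.
double dual_time_step(const std::function<double(double)>& rhs,
                      double u_old, double dt,
                      int iterations, double tolerance, double ds)
{
  assert(dt > 0.0 && ds > 0.0);
  double u = u_old;  // initial guess: the previous physical-time solution
  for (int k = 0; k < iterations; ++k) {
    const double residual = -(u - u_old) / dt + rhs(u);
    if (std::fabs(residual) < tolerance) {
      break;  // the BDF1 equation is satisfied to within tolerance
    }
    u += ds * residual;  // forward Euler step in pseudo time
  }
  return u;
}

// Test problem for the sketch: linear decay, du/dt = -u.
double decay_rhs(double u) { return -u; }
```

For the decay problem with `u_old = 1` and `dt = 0.1`, the pseudo-time iteration converges toward the backward Euler value `1 / (1 + dt)`. Forward Euler in pseudo time is only conditionally stable, so `ds` has to be small enough for the problem at hand.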
This commit removes the last remnant of the growing deposition model. No impact on regression tests.
1 lines of code changed in 1 file:
Revert last update to sub.mk... Sorry Todd, it appears that in CUDA builds
NVCC spaghettifies the linking, making everything require everything else (with
respect to cuda symbols). There might be a way to untangle it so that we could
build compare_uda (and the other utils) without requiring it to link to components
but that will have to wait for another day. -Dav
18 lines of code changed in 1 file:
Need to explicitly use bash, as some computers have sh pointing to dash.
1 lines of code changed in 1 file:
Compare_uda and other utilities do not need to link against components
27 lines of code changed in 2 files:
Fix the ability to make just Arches (--enable-arches instead of having to --enable-all-components).
M configure
M configure.ac
- Force the use of CMake 3.1+ if building Wasatch or VisIt.
- Add in RadProps info to W3P check.
M CCA/Components/Wasatch/Transport/sub.mk
M CCA/Components/Wasatch/Expressions/sub.mk
M CCA/Components/Wasatch/Operators/sub.mk
M CCA/Components/Arches/CoalModels/sub.mk
M CCA/Components/Arches/sub.mk
- Remove superfluous white space.
- Alphabetize.
- Fix comment typo (cut/paste error).
- Untabify (as appropriate).
M CCA/Components/Wasatch/sub.mk
- Same as above and:
- Fix the problem where CUDA_ENABLED_SRCS was not set if just building Arches.
- A number of files (eg: BCHelper.cc) were listed twice in the list of sources.
M Core/Grid/UnknownVariable.h
- Remove warning (Exception is now in the Uintah namespace so don't need to qualify it).
48 lines of code changed in 8 files:
Migrated all of the recent single level CPU changes over to the GPU code.
* This will change the answers slightly
257 lines of code changed in 4 files:
* Clean up some dummy labels that did nothing (dummySolve, anyone?)
* Add a standalone function to ensure BCs are set for all independent
  variables. This is run once at simulation startup.
* Add a RestartInitialize for property models. This is needed for models
  such as heat loss that are using a handoff file to ensure the handoff
  file is properly loaded.
* Make sure that the handoff file information is loaded for transport
  equations before calling getState.
  This is needed to ensure that the initial condition is properly set at
  the handoff file locations.
572 lines of code changed in 23 files:
Major implementation change for all 1L routines. The first in several big commits.
The ray location and all intermediate calculations are computed in physical units,
NOT the cell's (ijk) + some fraction of a cell. This simplifies debugging and the case where rays
move between levels.
- Added additional debugging output when enabled.
- Added #define FIXED_RAY_DIR to fix the ray's direction when enabled
- removed unused global variables
- simplified calculation of ray location on a cell face. KISS
- removed DyDx and DzDx from all 1L routines. We automatically account for non-cubic cells with the
new implementation.
- This also fixes a bug where there was a negative optical thickness when scattering.
** This will change the answers in all single level RMCRT tests by fuzz.
264 lines of code changed in 7 files:
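To illustrate why tracking the ray in physical units handles non-cubic cells without extra scale factors, here is a minimal sketch of finding the next cell face along a ray. It is not the RMCRT code: `Vec3`, `FaceHit`, and `next_face` are hypothetical names, and a uniform grid with its lower corner at `origin` is assumed.

```cpp
#include <array>
#include <cassert>
#include <cmath>
#include <limits>

using Vec3 = std::array<double, 3>;

struct FaceHit {
  int    axis;      // 0, 1 or 2: the axis whose face is crossed first
  double distance;  // physical distance travelled along the ray to that face
};

// pos, dir : ray position and unit direction in physical coordinates
// cell     : integer index (i,j,k) of the cell that contains pos
// dx       : physical cell size per axis (need not be equal)
// origin   : physical position of the lower corner of cell (0,0,0)
FaceHit next_face(const Vec3& pos, const Vec3& dir,
                  const std::array<int, 3>& cell,
                  const Vec3& dx, const Vec3& origin)
{
  FaceHit hit{-1, std::numeric_limits<double>::max()};
  for (int a = 0; a < 3; ++a) {
    if (dir[a] == 0.0) {
      continue;  // the ray never reaches a face on this axis
    }
    // Moving in +a hits the upper face of the cell, moving in -a the lower one.
    const int    step = dir[a] > 0.0 ? 1 : 0;
    const double face = origin[a] + (cell[a] + step) * dx[a];
    const double dist = (face - pos[a]) / dir[a];
    if (dist < hit.distance) {
      hit = FaceHit{a, dist};
    }
  }
  assert(hit.axis >= 0);  // the direction must not be the zero vector
  return hit;
}
```

Because face positions and the ray position are both physical coordinates, the cell aspect ratio enters only through `dx` when the face location is computed, and the returned distance can be used directly as a path length.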
Source the csafe-tester's .cshrc file before running configure command. I had that disabled for a reason and now I can't remember.....
1 lines of code changed in 1 file:
uncomment line to import check_output.
You need this if the user is using a non-system mpirun.
1 lines of code changed in 1 file:
Turned off nano pillar. It's failing and Jim is out of town.
1 lines of code changed in 1 file:
Eliminate conflict between Wasatch's Extrapolate operator and SpatialOps' Extrapolant operator. We should consolidate these if possible...
0 lines of code changed in 8 files:
Cleaning up warnings.
282 lines of code changed in 8 files:
More info for error.
1 lines of code changed in 1 file:
Put diffusivity inside of conditional for diffusion calcs in the particle
splitting code. Fix uninitialized variable inside of RFElasticPlastic.
12 lines of code changed in 2 files:
removed EOL spaces.
Changed path of tmp files from local directory to /tmp/.
Changes so you can parse a MasterUda/index file with relative paths.
0 lines of code changed in 2 files:
This commit should fix the previous commit with respect to Uintah Variable
types. Additionally, it adds support for Arches files to be pushed through
NVCC as needed.
M CCA/Components/Wasatch/sub.mk
- Added a brief comment on use of CUDA_ENABLED_SRCS.
M CCA/Components/Arches/TransportEqns/CQMOM_Convection.h
M CCA/Components/Arches/TransportEqns/Discretization_new.h
M CCA/Components/Arches/TransportEqns/CQMOMEqn.h
M CCA/Components/Arches/TransportEqns/DQMOMEqn.h
M CCA/Components/Arches/Task/TaskInterface.h
- Minor cleanups:
- Fix warning message.
- Some white space for readability.
- Remove an empty "private:" block.
M CCA/Components/Arches/CoalModels/sub.mk
M CCA/Components/Arches/TransportEqns/sub.mk
M CCA/Components/Arches/Transport/sub.mk
M CCA/Components/Arches/sub.mk
M CCA/Components/Arches/ParticleModels/sub.mk
M CCA/Components/Arches/PropertyModels/sub.mk
M CCA/Components/Arches/ChemMix/sub.mk
M CCA/Components/Arches/PropertyModelsV2/sub.mk
M CCA/Components/Arches/Task/sub.mk
M CCA/Components/Arches/SourceTerms/sub.mk
M CCA/Components/Arches/Operators/sub.mk
M CCA/Components/Arches/WallHTModels/sub.mk
M CCA/Components/Arches/Utility/sub.mk
M CCA/Components/Arches/LagrangianParticles/sub.mk
- Update Arches sub.mk files to correctly specify which files
need CUDA (nvcc) compilation.
M Core/Grid/Variables/CCVariable.h
M Core/Grid/Variables/SFCYVariable.h
M Core/Grid/Variables/NCVariable.h
M Core/Grid/Variables/SFCXVariable.h
M Core/Grid/Variables/SFCZVariable.h
M Core/Grid/Variables/ParticleVariable.h
M Core/Disclosure/TypeDescription.h
- Turns out that the registerMe variable was needed. There was an erroneous comment
in the previous version that implied that it wasn't needed (that the work happened
in the Variable constructor), but this was not the case. I have added a number of
comments to clarify how this works so the next person to dig into it will have an
easier time.
M Core/Disclosure/TypeDescription.cc
- Update file global variables to be designated as such to help avoid confusion.
- Assign to NULL as they are pointers and not integers.
1040 lines of code changed in 35 files:
This will make the deposition model function under steady-state assumptions. Won't impact regression tests.
15 lines of code changed in 4 files:
Add pDiffusivity to the particle splitting so that my RT will actually run.
10 lines of code changed in 1 file:
add in-situ var reference and added comments
44 lines of code changed in 2 files:
Initialize intensities in extra cells for discrete-ordinates for viz purposes.
4 lines of code changed in 1 file:
Remove more unused 'Register' stuff. Commit update to aclocal.m4 that is in configure already, but still not sure why these changes (from John or Tony) are necessary.
0 lines of code changed in 3 files:
Looks like these changes are needed... Still waiting on John/Tony to comment on what the reason is for this.
1 lines of code changed in 1 file:
updates for Ash
3 lines of code changed in 1 file:
A number of fixes for Titan (CUDA) and static builds in general. Also, with this commit, the stand-alone
tools such as puda and compare_uda should now work on machines that require static builds.
M configure
M configure.ac
- Turn back on the C++11 check.
- Fix check for broken exceptions (on Titan) that fails because the code is cross-compiled.
- When MPI is specified as built-in, the NVCC compiler fails to find mpi.h (as we don't specify
the -I/path/to/mpi/ flag). This commit uses the built-in compiler to find the location of mpi, and
provides the location to the NVCC compiler (using INC_MPI_H_NVCC).
- Remove some old debugging statements.
- Fix (hack) on Titan for problem with CUDA 7.0 and Boost. Must specify two -D flags.
M configVars.mk.in
- Add the INC_MPI_H_NVCC flag to the NVCC_CXXFLAGS var.
- Fix (copy/paste?) bug where cuda .d files were being deleted and thus make dependency information
was lost. I'm guessing that before this fix, Uintah CUDA developers must have experienced
strange behavior when modifying .h files as the code would not have re-built like it should have.
- Remove all .d files from the base directory... This is due to the fact that Titan's NVCC compiler
is leaving around many bogus/tmp .d files and not cleaning up after itself. It might be that
we should make a check for being on Titan (or a more general test for problematic NVCC .d
generation - though I'm not sure right off on how to write this check) - but for now the removing
of all .d files in the base dir should have no effect on anything else.
- Make verbose output not suppress the "rm" call.
M CCA/Components/Wasatch/Wasatch.h
M CCA/Components/Wasatch/Wasatch.cc
- Fix compiler warning about "const" return variables.
- White space.
M CCA/Components/ProblemSpecification/ProblemSpecReader.cc
- Allow for validation of .ups file on machines where the executable is not in build tree (eg: on Titan,
before running sus, you must move it to a file system visible to the compute nodes). In order
to use this functionality, you must copy the "inputs/" directory to the same location you move sus to.
- It appears that, when validating the .ups file, sus was doing the validation on every process. This
doesn't seem like it is necessary, so only do so now on proc 0...
M Core/Exceptions/ErrnoException.cc
- White space, {}s.
M Core/Grid/Variables/CCVariable.h
M Core/Grid/Variables/SFCYVariable.h
M Core/Grid/Variables/NCVariable.h
M Core/Grid/Variables/SFCXVariable.h
M Core/Grid/Variables/SFCZVariable.h
- Remove the "registerMe" code that is not used. Perhaps it was replaced with the current method of
registering Uintah variables but not cleaned up? Regardless, it doesn't do anything now, and
just makes tracking what is going on more difficult, so it is best that it be removed.
- Put one-line functions in a .h file on a single line to make it easier to view the class spec.
M Core/Grid/Variables/sub.mk
- Add the StaticInstantiate.cc file. This file is used to force Uintah variable registration on
machines that use static builds and don't register the variables in the normal/correct way
when the shared library (static constructors fire) is loaded.
A Core/Grid/Variables/StaticInstantiate.h
A Core/Grid/Variables/StaticInstantiate.cc
- Uintah relies on our variable types (CCVariable, NCVariable, etc) being registered with Uintah
before it allows components to create a variable of the given type. On most systems (and
specifically for shared lib builds), when the CCVariable.o code is loaded from libCore_Grid.so
static constructors fire and register the variables. However, some machines that use static
builds (such as Titan) appear to "optimize out" these constructors, and thus the variables
never register themselves and when puda or compare_uda (etc) try to load data, the Uintah
type system says it does not recognize the type and dies. This commit adds the (hack) function
instantiateVariableTypes() whose sole purpose is to create variables of all types that Uintah
uses so that they will register themselves with the type system. This function is called the
first time a DataArchive is created (on static builds). All the variables it creates go away
as soon as the function is done, but have by then registered themselves.
M Core/DataArchive/DataArchive.h
M Core/DataArchive/DataArchive.cc
- Use the instantiateVariableTypes() function the 1st time a DataArchive is created (on static builds).
- Add a few more 'const's for variables that must not change.
- Indent properly, white space.
- Move some private data to the main private section.
M Core/Disclosure/TypeUtils.cc
- Move to 80+ columns for better readability.
M Core/Disclosure/TypeDescription.cc
- Remove the (non-used) Register section.
527 lines of code changed in 19 files:
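The forced-registration workaround described in the entry above (StaticInstantiate) can be sketched in a few lines. This is only an illustration under simplifying assumptions: `registry`, `is_registered`, `CellVar`, `NodeVar`, and `instantiate_variable_types` are hypothetical stand-ins, not the Uintah type system or its variable classes.

```cpp
#include <cassert>
#include <set>
#include <string>

// Hypothetical stand-in for the type system: a set of known type names.
// A function-local static avoids static-initialization-order problems.
std::set<std::string>& registry()
{
  static std::set<std::string> names;
  return names;
}

bool is_registered(const std::string& name)
{
  return registry().count(name) > 0;
}

// Each variable type registers itself whenever an instance is constructed.
struct CellVar { CellVar() { registry().insert("CellVar"); } };
struct NodeVar { NodeVar() { registry().insert("NodeVar"); } };

// Forced registration for builds where nothing else constructs the types
// early enough: create one temporary of every type. The temporaries are
// destroyed on return, but their registrations persist.
void instantiate_variable_types()
{
  static bool done = false;
  if (done) {
    return;  // only the first call does any work
  }
  CellVar c;
  NodeVar n;
  (void)c;
  (void)n;
  done = true;
  assert(is_registered("CellVar") && is_registered("NodeVar"));
}
```

Calling `instantiate_variable_types()` before any data is read guarantees that lookups by type name succeed, regardless of whether the linker kept the usual static constructors.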
Putting the nano-pillar test back in.
1 lines of code changed in 1 file:
remove get3DPointer from Array3Data
53 lines of code changed in 7 files:
Missed this detail in previous commit (55185).
8 lines of code changed in 1 file:
Added ash density specification for deposition model.
5 lines of code changed in 2 files:
Remove a BC check that was occurring every timestep (thanks Derek for
pointing this out.) The check now occurs only once, on startup and on
restart.
0 lines of code changed in 1 file:
This commit makes the hybrid momentum discretization approach upwind wall cells by default. No impact on regression tests.
0 lines of code changed in 6 files:
Added non-performance nightly RT test for OFC4.
1 lines of code changed in 1 file:
More potential improvements to the Hypre solver interface. Apparently, there is NO need to reassemble the coefficient matrix, the solution, and the RHS vectors.
According to Hypre timings, this results in a 4-6X speedup on the MATRIX+VECTOR assembly timings (ONLY).
Those may be just a tiny fraction of the overall time spent in Hypre at small scale < 4K cores.
Their effects, however, remain to be seen at large scale as well as with the Arches DO radiation solver.
5 lines of code changed in 4 files:
remove #if 0 from the HypreSolver interface
3 lines of code changed in 1 file:
Removed diffusion from upwind coefficients, as Phil suggested.
18 lines of code changed in 3 files:
remove misleading code that doesn't work
0 lines of code changed in 1 file:
add a few more notes to describe the usage of the new Hypre timers
6 lines of code changed in 1 file:
fix, cleanup, and improve hypre timings in the Hypre solver.
The time reported by sus for Hypre is based on times measured on Rank 0 only.
While it is very likely that the Hypre timings for all ranks are similar, we should also be able to measure the maximum time taken across all ranks.
To turn this feature on, simply uncomment #define HYPRE_TIMING in HypreSolver.h and rebuild Uintah.
In the absence of this flag, there is NO COST associated with having these hypre timing functions since they will turn into empty #define macros (see hypre/src/_hypre_utilities.h)
92 lines of code changed in 2 files:
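The "no cost when disabled" point in the Hypre timing entry above relies on a common preprocessor pattern, sketched here. The flag `SKETCH_TIMING` and the macros `TIMER_START` / `TIMER_STOP` are hypothetical and are not the hypre or Uintah macros; the sketch only shows how timing calls can compile away to nothing.

```cpp
#include <cassert>
#include <chrono>

// Hypothetical flag for this sketch; uncomment and rebuild to enable timing.
// #define SKETCH_TIMING

double g_solve_seconds = 0.0;  // accumulated wall time (only updated when enabled)
int    g_timer_events  = 0;    // number of timed regions (only updated when enabled)

#ifdef SKETCH_TIMING
  #define TIMER_START(name) \
    const auto name = std::chrono::steady_clock::now(); ++g_timer_events
  #define TIMER_STOP(name) \
    g_solve_seconds += std::chrono::duration<double>( \
        std::chrono::steady_clock::now() - name).count()
#else
  // Disabled: both macros expand to an empty expression, so the compiled
  // code contains no timing calls at all.
  #define TIMER_START(name) ((void)0)
  #define TIMER_STOP(name)  ((void)0)
#endif

// Stand-in for a solver call: sums the integers 1..n inside a timed region.
int solve(int n)
{
  assert(n >= 0);
  TIMER_START(t0);
  int sum = 0;
  for (int i = 1; i <= n; ++i) {
    sum += i;
  }
  TIMER_STOP(t0);
  return sum;
}
```

With the flag left commented out, `solve` behaves exactly as if the timer lines were not there, and the counters stay at zero.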
- Make the computeStableTimestep part of the solver abstraction.
- Add a KokkosSolver
525 lines of code changed in 8 files:
Functorized the new momentum source terms. I still need to add all of the kokkos stuff.
30 lines of code changed in 2 files:
create build dirs, instead of building in source dirs
0 lines of code changed in 4 files:
Added missing momentum source term from char oxidation.
313 lines of code changed in 4 files:
One additional improvement to the Hypre solver interface:
Do NOT allocate/destroy grid every timestep unless necessary (timestep = 1, setupfrequency, restart).
Please let me know if you notice any speedups at large scale (32K+ cores).
62 lines of code changed in 2 files:
Added missing momentum source term from devolatilization.
312 lines of code changed in 4 files:
Remove a debug print statement.
0 lines of code changed in 2 files:
* Initial commit of a Kokkos-based implementation of single-level RMCRT.
This code utilizes a functor-based approach to implement the solveDivQ cellIterator loop with Kokkos.
566 lines of code changed in 1 file:
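The functor-based approach mentioned in the Kokkos RMCRT entry above packages a loop body and the data it needs into an object that a dispatch routine calls once per index. The sketch below shows the pattern in plain C++. `parallel_for_sketch` is a hypothetical serial stand-in for a parallel dispatch such as Kokkos::parallel_for, and `ScaleFunctor` is a placeholder loop body, not the solveDivQ physics.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical serial stand-in for a parallel dispatch: calls f(i) for each
// index. A real backend would distribute the indices across threads.
template <typename Functor>
void parallel_for_sketch(std::size_t n, const Functor& f)
{
  for (std::size_t i = 0; i < n; ++i) {
    f(i);
  }
}

// The functor holds everything the loop body needs as members and exposes
// the body itself as a const call operator.
struct ScaleFunctor {
  const double* in;
  double*       out;
  double        factor;

  void operator()(std::size_t i) const { out[i] = factor * in[i]; }
};

// Applies the functor to every cell of the input.
void scale_cells(const std::vector<double>& in, std::vector<double>& out,
                 double factor)
{
  assert(in.size() == out.size());
  parallel_for_sketch(in.size(), ScaleFunctor{in.data(), out.data(), factor});
}
```

Because each call writes only `out[i]`, the iterations are independent, which is what lets a parallel backend run them in any order.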
Added hybrid discretization scheme for momentum equations.
958 lines of code changed in 17 files:
updated outputProblemSpec to output diff_curve info
17 lines of code changed in 1 file:
Added diff_curve input to the NonLinearDiff1 model.
164 lines of code changed in 5 files: