remove small cudaMemcpys (#175)
This patch removes the many small `cudaMemcpy` calls for single values, except for those from calling `net_receive` in event delivery.

The small copies during initialization were from when the upper diagonal and time invariant component of the diagonal were computed on the host. There were many small reads/writes to device memory accessing the `p` and `u` vectors.

* Remove many small device copies in matrix setup by copying required data to host, computing, and then copying back in one copy.
* Add `constexpr` test `is_debug_mode()` for having been compiled in debug mode (tests `NDEBUG`).
* Only perform `is_physical_solution` test if `is_debug_mode()` is true.
  (The `is_physical_solution` test triggers a single copy from device to host on each time step to test whether the voltage has exceeded some "reasonable" physical bounds.)
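
The debug-mode gating described above can be sketched roughly as follows. This is an illustrative stand-in, not the code from the patch: only the names `is_debug_mode()`, `is_physical_solution` and `NDEBUG` come from the description, while the function signatures, the `copy_counter` type, the `step_is_acceptable` helper and the voltage bound are assumptions made for the example. The device-to-host copy is simulated with a counter so the sketch compiles without CUDA.

```cpp
#include <cmath>
#include <cstddef>
#include <vector>

// True when the translation unit was compiled without NDEBUG,
// i.e. in what is conventionally treated as debug mode.
constexpr bool is_debug_mode() {
#ifndef NDEBUG
    return true;
#else
    return false;
#endif
}

// Hypothetical helper: counts simulated device-to-host copies.
struct copy_counter {
    std::size_t device_to_host = 0;
};

// Hypothetical stand-in for the physical-bounds check. In the real code
// this would trigger one device-to-host copy per call; here the copy is
// only counted. The bound is an assumed value, not taken from the patch.
inline bool is_physical_solution(const std::vector<double>& voltage,
                                 copy_counter& counter,
                                 double bound = 1000.0) {
    ++counter.device_to_host;
    for (double v : voltage) {
        if (std::isnan(v) || std::abs(v) > bound) {
            return false;
        }
    }
    return true;
}

// Hypothetical per-time-step wrapper: the check, and therefore the copy,
// only happens in debug builds. Release builds skip it entirely.
inline bool step_is_acceptable(const std::vector<double>& voltage,
                               copy_counter& counter) {
    if (is_debug_mode()) {
        return is_physical_solution(voltage, counter);
    }
    return true;
}
```

Because `is_debug_mode()` is `constexpr`, a compiler can discard the guarded branch in release builds, so the per-step copy disappears without needing preprocessor conditionals at every call site.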