
hypermine's Issues

Chunk LoD

As of #63, we're GPU bound on weak hardware. Using simplified geometry for distant chunks could help alleviate this. A very simple approach would be to run marching cubes on entire chunks, such that each chunk is a single cell, containing only a few triangles. Challenges include:

  • Preventing geometric gaps between LoD levels
  • Addressing the discontinuity between boxy and smooth terrain (if we retain boxy voxels)
  • Texturing the topologically very distinct geometry (if we retain boxy voxels)

More precise terrain-gen short circuiting

We currently use a rough, empirically derived heuristic to decide when terrain gen can be skipped for a chunk because it cannot contain a surface. We could make this more precise by sampling the elevation at multiple points in the chunk (e.g. each corner) and computing an exact bound on how much the chunk's true minimum/maximum elevation can differ from the minimum/maximum of those samples.
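
A minimal sketch of that bound check, assuming the elevation field has a known maximum slope (Lipschitz bound) and that the eight chunk corners are sampled; the names and constants here are illustrative, not the engine's:

/// Conservative test: can a surface (elevation crossing zero) exist in this chunk?
/// `max_slope` bounds how fast elevation can change per unit of distance, and
/// `corner_distance_bound` is the maximum distance from any point in the chunk
/// to its nearest sampled corner; both are assumptions for this sketch.
fn chunk_may_contain_surface(
    corner_elevations: &[f64; 8],
    max_slope: f64,
    corner_distance_bound: f64,
) -> bool {
    let min = corner_elevations.iter().cloned().fold(f64::INFINITY, f64::min);
    let max = corner_elevations.iter().cloned().fold(f64::NEG_INFINITY, f64::max);
    // The true elevation anywhere in the chunk lies within this widened interval.
    let slack = max_slope * corner_distance_bound;
    (min - slack) <= 0.0 && (max + slack) >= 0.0
}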

Dynamic lighting via clustered forward rendering

This should give us reasonably cheap and scalable small dynamic lights, which suits the hyperbolic space well since large lights would have to be unreasonably powerful. Will provide a foundation for iterating towards a more visually interesting aesthetic.
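
For reference, the core of the technique is mapping each fragment to a cluster (a screen-space tile plus a logarithmic depth slice) whose light list it then iterates. A minimal sketch of that mapping, with illustrative names and constants rather than anything from the engine:

/// Map a fragment to its cluster index. The frustum is divided into
/// `tiles_x * tiles_y` screen tiles and `z_slices` logarithmic depth slices;
/// each cluster stores the list of lights that affect it.
fn cluster_index(
    frag_x: u32, frag_y: u32, view_z: f32,
    tile_size: u32, tiles_x: u32, tiles_y: u32,
    z_near: f32, z_far: f32, z_slices: u32,
) -> u32 {
    let tx = frag_x / tile_size;
    let ty = frag_y / tile_size;
    // Logarithmic slicing keeps clusters roughly cube-shaped in view space.
    let slice = ((view_z / z_near).ln() / (z_far / z_near).ln() * z_slices as f32) as u32;
    tx + ty * tiles_x + slice.min(z_slices - 1) * tiles_x * tiles_y
}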

Consider reducing draw calls in voxel rendering

We currently make one draw call per voxel chunk. When in the middle of the right sort of valley, this can add up to several thousand draw calls, which may waste significant CPU time. If profiling confirms this, multi-draw-indirect could be used to reduce to a single call, at the cost of maintaining an indirect buffer identifying the surfaces to draw each frame.
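
A rough sketch of what the indirect buffer contents could look like, assuming the ash bindings already in use; SurfaceDraw and the surrounding buffer plumbing are hypothetical:

use ash::vk;

/// Hypothetical per-chunk record; `first_vertex`/`vertex_count` locate the
/// chunk's extracted surface within the shared vertex buffer.
struct SurfaceDraw {
    first_vertex: u32,
    vertex_count: u32,
}

/// Build the indirect buffer contents for one frame. Writing these into a
/// buffer and issuing a single cmd_draw_indirect replaces the per-chunk draw
/// calls; the instance index lets the vertex shader look up per-chunk data.
fn build_indirect_commands(surfaces: &[SurfaceDraw]) -> Vec<vk::DrawIndirectCommand> {
    surfaces
        .iter()
        .enumerate()
        .map(|(i, s)| vk::DrawIndirectCommand {
            vertex_count: s.vertex_count,
            instance_count: 1,
            first_vertex: s.first_vertex,
            first_instance: i as u32,
        })
        .collect()
}

// Then, once per frame, a single call on the order of:
// unsafe {
//     device.cmd_draw_indirect(cmd, indirect_buffer, 0, commands.len() as u32,
//                              std::mem::size_of::<vk::DrawIndirectCommand>() as u32);
// }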

Release host memory for distant voxel chunks

Although we already carefully manage video memory, we currently store the original voxel data in host memory for the duration of the process. To limit memory use when exploring large areas, we should free distant chunks, and regenerate them in the rare event that they're needed again.

A straightforward approach would be to store the actual voxel data in a new LRU table, indexed into by the graph.
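
A minimal sketch of such a table; the key/value types stand in for whatever the graph actually uses, and a real version would want O(1) recency updates, but this shows the shape of the graph-owned slot index plus eviction:

use std::collections::VecDeque;

struct LruTable<K, V> {
    capacity: usize,
    entries: Vec<(K, V)>,
    /// Front = most recently used. A production version would use an intrusive
    /// doubly linked list so `get` is O(1).
    order: VecDeque<usize>,
}

impl<K, V> LruTable<K, V> {
    fn new(capacity: usize) -> Self {
        Self { capacity, entries: Vec::new(), order: VecDeque::new() }
    }

    /// Store a chunk's voxels, evicting the least recently used entry if full.
    /// Returns the slot the graph should remember, plus the evicted chunk's key
    /// (if any) so the graph can forget its old slot and regenerate on demand.
    fn insert(&mut self, key: K, value: V) -> (usize, Option<K>) {
        if self.entries.len() < self.capacity {
            self.entries.push((key, value));
            let slot = self.entries.len() - 1;
            self.order.push_front(slot);
            (slot, None)
        } else {
            let slot = self.order.pop_back().expect("capacity must be > 0");
            let evicted = std::mem::replace(&mut self.entries[slot], (key, value)).0;
            self.order.push_front(slot);
            (slot, Some(evicted))
        }
    }

    /// Access a chunk's voxels, marking them as most recently used.
    fn get(&mut self, slot: usize) -> &V {
        if let Some(pos) = self.order.iter().position(|&s| s == slot) {
            self.order.remove(pos);
            self.order.push_front(slot);
        }
        &self.entries[slot].1
    }
}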

Populate chunk margins

Chunk data sent to the GPU is wrapped in a one-voxel margin that should be, but is not currently, populated with the data from neighboring chunks. If supplied, this data would produce correct AO and hidden face detection for voxels on the face of a chunk, improving both visuals and performance.

There are two exceptions:

  • Chunks which are otherwise void do not benefit from having margins set, as they have no faces to hide or occlude. Setting a margin might also interfere with efficient sparse representation of such chunks.
  • Chunk faces which face graph nodes that don't yet exist, i.e. outside of the graph, cannot be populated, as their chunks are sensitive to parameters that have not yet been computed. Fortunately, we can make up whatever values we like for them: a camera within the graph cannot possibly observe any potentially exposed voxel faces along such a chunk face, and ambient occlusion errors will be difficult to see, especially in the presence of distance fog.
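
As a concrete illustration of the non-exceptional case, here is a minimal sketch of copying one neighbor face into a margin, assuming a (D+2)^3 voxel array with the margin included and ignoring the axis permutations/reflections the real chunk adjacency may require across node boundaries; the dimension and voxel type are placeholders:

const D: usize = 12;     // interior chunk dimension, illustrative only
const S: usize = D + 2;  // stride including the one-voxel margin

fn index(x: usize, y: usize, z: usize) -> usize {
    x + y * S + z * S * S
}

/// Copy the neighbor's +X interior face into our -X margin, so AO and hidden
/// face detection see the correct voxels along that chunk face.
fn fill_neg_x_margin(ours: &mut [u16], neighbor: &[u16]) {
    for z in 1..=D {
        for y in 1..=D {
            ours[index(0, y, z)] = neighbor[index(D, y, z)];
        }
    }
}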

Orient camera perpendicular to gravity

Clients should automatically send inputs that will orient their character consistently upright. We shouldn't force this on the server side, as that needlessly reduces flexibility.

Populate chunks on-demand

We're currently generating voxels for chunks the instant we have sufficient information to do so. This works fine in a brand new world, but leads to significant hangs when connecting to a server where extensive exploration has already happened. We could prevent this by only generating voxels when a chunk first enters the view distance.

Label dodeca sides

I'm trying to mess around with the code a little bit and implement my own inter-chunk generation similar to roads, and a pretty crucial part of that is figuring out which side of the dodeca is up, which sides are horizontal, and which sides point down. (Please note I'm looking at this code from an outside perspective, so if this is a dumb question, please forgive me.)

Looking at worldgen.rs, adjacent chunks of the road object are labeled based on the dodeca.rs module. I want to try my hand at building a tree, so I want to know how to extend an object similar to the road into two dimensions. The issue is I don't know how the sides in dodeca correspond to directions in the actual world.

Crashes entire OS

I started up Hypermine for the first time and looked around a bit. After just a couple minutes both my screens stopped receiving output from my video card. The computer didn't appear to shut down; I had to do a hard shutdown and restart. I don't see any log files so I'm not sure how to give more information.

I'm going to try leaving some audio playing in the background so that I at least know whether that gets stopped by the crash.

Verify server certificates

This is necessary to prevent trivial man-in-the-middle attacks on clients. However, care is needed to ensure servers can still be easily hosted on a LAN or without a domain name. We'll likely want a combination of approaches:

  • Traditional PKI for servers with real domain names
  • Trust-on-first-use for WAN IP addresses
  • Something else for LAN IP addresses, since TOFU is likely to be too onerous on addresses that will be frequently reassigned. Maybe disable verification and display a fingerprint that can be manually verified if desired?

Improve numerical stability of ground plane distance estimation

At sufficiently high elevations with respect to the ground plane, glitchy floating islands appear. This is likely due to exponential growth of the magnitude of the normal vector encoding the distance to the ground plane. We could mitigate the representation problem by adding precision in the form of an explicit exponent: rather than the vector v, store the pair (v/(2^c), c). Further investigation (@MagmaMcFry?) is needed to determine how to compute the distance/direction of the plane from this representation without running into the same issue again.
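
A sketch of the proposed (v/(2^c), c) representation, using nalgebra; the renormalization below only addresses storage, and leaves open the question of computing distance/direction without reintroducing the overflow:

use nalgebra as na;

/// Keep the plane vector at unit-scale magnitude plus an explicit power-of-two
/// exponent, so repeated transformation doesn't lose precision or overflow.
struct ScaledPlane {
    /// v / 2^exp, kept renormalized so its components stay near 1.
    mantissa: na::Vector4<f64>,
    exp: i32,
}

impl ScaledPlane {
    fn new(v: na::Vector4<f64>) -> Self {
        let mut out = Self { mantissa: v, exp: 0 };
        out.renormalize();
        out
    }

    /// Pull the magnitude back towards 1 and account for it in the exponent.
    fn renormalize(&mut self) {
        let mag = self.mantissa.amax();
        if mag > 0.0 {
            // frexp-style split: after this, the largest component is in (0.5, 1].
            let e = mag.log2().ceil() as i32;
            self.mantissa /= (e as f64).exp2();
            self.exp += e;
        }
    }
}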

[mvk-error] VK_ERROR_INITIALIZATION_FAILED: vkCreateMacOSSurfaceMVK(): On-screen rendering requires a layer of type CAMetalLayer

When I run the client I receive the error [mvk-error] VK_ERROR_INITIALIZATION_FAILED: vkCreateMacOSSurfaceMVK(): On-screen rendering requires a layer of type CAMetalLayer

I am running macOS 10.15.3

Full stack trace is:

$ cargo run --bin client
    Finished dev [unoptimized + debuginfo] target(s) in 0.16s
     Running `target/debug/client`
[mvk-error] VK_ERROR_INITIALIZATION_FAILED: vkCreateMacOSSurfaceMVK(): On-screen rendering requires a layer of type CAMetalLayer.
Apr 12 12:53:02.559 ERROR vulkan: VK_ERROR_INITIALIZATION_FAILED: vkCreateMacOSSurfaceMVK(): On-screen rendering requires a layer of type CAMetalLayer. id= number=0 queue_labels= cmd_labels= objects=SURFACE_KHR 7fdf786d9550
thread 'main' panicked at 'called `Result::unwrap()` on an `Err` value: ERROR_INITIALIZATION_FAILED', client/src/graphics/window.rs:57:13
stack backtrace:
   0: std::sys_common::at_exit_imp::push
   1: core::fmt::ArgumentV1::show_usize
   2: std::io::Write::write_fmt
   3: std::panicking::default_hook::{{closure}}
   4: std::panicking::default_hook
   5: <std::panicking::begin_panic::PanicPayload<A> as core::panic::BoxMeUp>::get
   6: std::panicking::try::do_call
   7: std::panicking::begin_panic
   8: std::panicking::begin_panic
   9: core::result::Result<T,E>::unwrap
             at /rustc/b8cedc00407a4c56a3bda1ed605c6fc166655447/src/libcore/result.rs:963
  10: client::graphics::window::Window::new
             at client/src/graphics/window.rs:57
  11: client::main
             at client/src/main.rs:68
  12: std::rt::lang_start::{{closure}}
             at /rustc/b8cedc00407a4c56a3bda1ed605c6fc166655447/src/libstd/rt.rs:67
  13: std::panicking::try::do_call
  14: panic_unwind::dwarf::eh::read_encoded_pointer
  15: <std::panicking::begin_panic::PanicPayload<A> as core::panic::BoxMeUp>::get
  16: std::rt::lang_start
             at /rustc/b8cedc00407a4c56a3bda1ed605c6fc166655447/src/libstd/rt.rs:67
  17: client::main
note: Some details are omitted, run with `RUST_BACKTRACE=full` for a verbose backtrace.

Before I got this error, I was receiving LibraryLoadError("dlopen(libvulkan.dylib, 1): image not found") which was resolved by installing the latest Vulkan SDK (1.2.135.0) from https://vulkan.lunarg.com/sdk/home. So I think it's possible I'm using the wrong version of Vulkan.

Configurable scale

We currently work mostly in absolute units with respect to the hyperbolic space. However, it's more convenient to reference the size of a voxel for scale. Movement speed, player collision geometry, and asset size should all be defined in these terms. For ease of experimentation, we should also allow the scale of voxels themselves to be configured, decoupling them from those other factors.
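
A sketch of what the configuration surface might look like, assuming the existing serde-based config loading; all field names are illustrative:

use serde::Deserialize;

#[derive(Deserialize)]
struct ScaleConfig {
    /// Edge length of one voxel in absolute (hyperbolic) units.
    voxel_size: f64,
    /// Quantities below are expressed in voxel units and multiplied by
    /// `voxel_size` before use.
    movement_speed: f64,
    player_radius: f64,
}

impl ScaleConfig {
    fn movement_speed_absolute(&self) -> f64 {
        self.movement_speed * self.voxel_size
    }
}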

Short-circuit terrain gen for empty/solid chunks

We're spending a lot of CPU time deciding that every single voxel of every single empty chunk is indeed empty. We should be able to use the values of the max elevation field at the vertices of a chunk, combined with knowledge of the shape of the distance field to the ground plane, to efficiently determine that a chunk has no/exclusively solid voxels and skip it.

Care will be necessary when optimizing solid chunks, as they must only be short-circuited if their margins are also solid. However, we can ignore (i.e. assume solid) margins facing nodes that have not yet been constructed, because a chunk cannot possibly be observed from that direction; such faces would always be backface-culled by a camera located within the graph.
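
A rough sketch of the resulting classification, where min_elev/max_elev are conservative bounds on elevation over the chunk plus its margin (derived from the vertex samples and the shape of the distance field); the names and the sign convention (positive = above ground) are assumptions:

enum ChunkClass {
    Void,  // entirely air: skip generation and rendering
    Solid, // entirely solid, margins included: nothing to extract
    Mixed, // must run full terrain generation
}

/// `min_elev`/`max_elev` must bound the elevation over the chunk *and* its
/// one-voxel margin, so a solid chunk is only skipped when its margins are
/// solid too; margins facing unbuilt nodes can be assumed solid as noted above.
fn classify(min_elev: f64, max_elev: f64) -> ChunkClass {
    if min_elev > 0.0 {
        ChunkClass::Void
    } else if max_elev < 0.0 {
        ChunkClass::Solid
    } else {
        ChunkClass::Mixed
    }
}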

python3 not recognized as "python"

On my new computer I had been getting an error in the build process because "python" was not a recognized command:

thread 'main' panicked at 'couldn't find required command: "python"'

I was able to fix it by installing the package python-is-python3, but it would be incredibly helpful for other users if they didn't need to do so, or if the troubleshooting message was more informative.

Generate large-scale structures

One of the best ways to exhibit hyperbolic geometry is with structures that are impossible in euclidean space. We should extend worldgen to produce a variety of such structures, e.g. yendorian trees, whose branches don't decrease in size compared to their parent branch. These structures are characterized by being substantially larger than a graph node; think large buildings, cave systems, and megastructures. They don't necessarily have to be constructed from voxels.

The main challenges here are generating such features deterministically no matter which direction they're approached from, and defining complex, interesting hyperbolic geometry in a maintainable way.

Generate terrain in a thread pool

We currently generate chunks on the rendering thread, leading to hitching. This is an embarrassingly parallel and asynchronous problem, perfectly suited to a thread pool.
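
A minimal sketch of the hand-off, using crossbeam-channel (an assumed dependency, chosen for its cloneable receivers); Params and VoxelData stand in for the real chunk-generation inputs and outputs:

use crossbeam_channel as channel;
use std::thread;

struct Params;
struct VoxelData;

fn generate_chunk(_p: Params) -> VoxelData { VoxelData }

/// Spawn `n` workers. The render thread sends Params as chunks become needed
/// and drains the result channel with try_recv once per frame, so generation
/// never blocks rendering.
fn spawn_workers(n: usize) -> (channel::Sender<Params>, channel::Receiver<VoxelData>) {
    let (work_tx, work_rx) = channel::unbounded::<Params>();
    let (done_tx, done_rx) = channel::unbounded::<VoxelData>();
    for _ in 0..n {
        let work_rx = work_rx.clone();
        let done_tx = done_tx.clone();
        thread::spawn(move || {
            // Each worker exits when the work sender is dropped.
            for params in work_rx.iter() {
                let _ = done_tx.send(generate_chunk(params));
            }
        });
    }
    (work_tx, done_rx)
}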

Optimize chunk transform streaming

Currently, for each frame, for each chunk, we invoke vkCmdUpdateBuffer with the transform from that chunk to the local node. In a valley, this can add up to hundreds of kilobytes. This is a bit of an abuse of vkCmdUpdateBuffer and may explain the large CPU time spent preparing to render chunks. There are a number of improvements to be made:

  • Use a staging mapped buffer and transfer command. This should mitigate driver overhead, and may improve performance substantially all on its own.
  • Because the underlying honeycomb is regular, we can drastically reduce the amount of bandwidth used by storing a precomputed table of transforms to the origin node from the chunks surrounding the origin node out to the maximum view distance, and maintaining a buffer of indices mapping the neighborhood of the player to analogous chunks surrounding the origin. This buffer is 1/32 the size of the current transform buffer, and would need to be rewritten every time the player moves between nodes, but small incremental writes could be used otherwise. This also saves us from doing a bunch of matrix multiplication as we traverse the graph, which might improve traversal performance significantly (currently 2-4ms/frame). A sketch of this layout follows the list below.
  • As of #53, chunk transform information (of whatever nature) can be passed through an instance buffer rather than looked up in a storage buffer, simplifying and perhaps slightly optimizing the vertex shader.
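
A sketch of the data layout proposed in the second point, with illustrative names and nalgebra for the matrices: the transform table is computed and uploaded once, so per-frame streaming shrinks to small index writes.

use nalgebra as na;

struct ChunkTransforms {
    /// Transform from each chunk in the canonical neighborhood of the origin
    /// node to the origin node, out to the maximum view distance. Uploaded to
    /// the GPU once at startup.
    table: Vec<na::Matrix4<f32>>,
    /// Per-visible-chunk index into `table`; combined with the current
    /// node-to-view transform, this replaces streaming full 4x4 matrices.
    indices: Vec<u32>,
}

impl ChunkTransforms {
    /// Full rewrite happens only when the player crosses into a new node;
    /// otherwise only indices for newly visible chunks are written.
    fn rebuild_indices(&mut self, visible_chunks: &[u32]) {
        self.indices.clear();
        self.indices.extend_from_slice(visible_chunks);
    }

    /// Worst-case bytes streamed per frame under this scheme.
    fn per_frame_bytes(&self) -> usize {
        self.indices.len() * std::mem::size_of::<u32>()
    }
}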

Generate clutter

The terrain currently has diverse materials and decent hills/valleys, but looks very dead. We should liven it with plant life, rocks, and whatever other small features occur to us. Clutter could be static meshes, voxel structures, or both, but meshes will be much easier as chunk boundaries can be mostly ignored. Placement should be randomized and sensitive to environment factors.

Graceful shutdown

Exiting a client or server (e.g. due to SIGINT or the window close button) should cause the network connection to be gracefully closed before the process terminates, allowing the peer to clean up resources in a timely fashion and provide better feedback to other users.

  • Client
  • Server
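
A sketch of one way to hook this up, assuming the existing tokio runtime; close_all_connections and main_loop are placeholders for whatever teardown the client/server actually needs (e.g. closing the connection with an application error code and giving the packet a moment to flush):

async fn run_until_shutdown() {
    tokio::select! {
        _ = tokio::signal::ctrl_c() => {
            // Notify peers and wait briefly for delivery before exiting.
            close_all_connections().await;
        }
        _ = main_loop() => {}
    }
}

async fn close_all_connections() { /* placeholder: graceful connection close */ }
async fn main_loop() { /* placeholder: existing client/server loop */ }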

Node wireframes

To help with extending and debugging the state machine, it will be helpful to have the option to render wireframes or markers indicating nodes, edges between nodes (labeled/colored by face type and directed to the shorter node, with parent edges marked specially), chunk boundaries, and the order of axes in the chunk coordinate system.

Walking

Looking around is already cool, but being able to walk on this terrain would be even cooler.

Depends on #34 I guess.

Optimize spatial queries used by chunk/entity rendering

To compute the set of chunks/entities that need to be rendered, we traverse the portion of the graph overlapping with a sphere around the viewpoint. This is empirically a major CPU time sink. Available optimizations include:

  • Avoid redundant per-frame traversals
  • Skip portions of the graph outside the view frustum
  • Precompute node-to-node transform matrices and edges to traverse using a secondary graph. See also remaining work in #55.

See also #73 for refactoring of related code.

More accurate view distance culling

Currently, Graph::nearby_nodes only considers node centers. As a result, chunks that fall within view distance are not rendered if the center of their node does not. While an exact solution requires potentially nontrivial hyperbolic geometry predicates, we could get significantly closer by checking the distance to each vertex of the node and rendering chunks that lie on vertices within view distance even if the center of their node is not. Alternatively, chunk bounding spheres could be used as discussed in #100.
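
A minimal sketch of the vertex test, assuming points stored on the hyperboloid; the Minkowski inner product helper below is a standalone stand-in for whatever equivalent the engine already has:

use nalgebra as na;

/// Minkowski inner product; for normalized points p, q on the hyperboloid this
/// equals -cosh(d(p, q)).
fn mip(a: &na::Vector4<f64>, b: &na::Vector4<f64>) -> f64 {
    a.x * b.x + a.y * b.y + a.z * b.z - a.w * b.w
}

/// Render a node's chunks if *any* of its vertices is within view distance,
/// rather than only testing the node center.
fn node_possibly_visible(
    view_point: &na::Vector4<f64>,
    node_center: &na::Vector4<f64>,
    node_vertices: &[na::Vector4<f64>; 20],
    view_distance: f64,
) -> bool {
    let within = |p: &na::Vector4<f64>| -mip(view_point, p) <= view_distance.cosh();
    within(node_center) || node_vertices.iter().any(within)
}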

Investigate/optimize surface extraction performance

When significant numbers of chunks are in view, the game currently suffers from severe pop-in. This is mainly because we default to loading at most 16 chunks per frame, based on the amount of surface extraction an Intel GPU seems to be able to handle with mostly adequate performance. We need to do better. Avenues to explore include:

  • Improve pipelining: currently we extract and then render on the same frame, when we could overlap with the next frame's CPU work instead
  • Improve parallelism: each chunk's worth of work has its own set of fine-grained barriers synchronizing it. Rumour has it that drivers tend to implement fine-grained barriers as global barriers, so this may be causing chunks to be processed one-at-a-time.
  • Profile the compute shaders. There may be optimization opportunities.
  • Investigate performance on more powerful hardware. If this is only a problem on Intel, that's not the end of the world.
  • Use async compute for even better pipelining on capable hardware (not Intel)
  • Move more work onto the CPU. At present, terrain generation consumes a vanishingly small amount of CPU.

Dynamic GPU memory allocation for extracted surfaces

We currently allocate a hardcoded number (4096) of worst-case sized slots for extracted surfaces. This makes inefficient use of memory, putting us uncomfortably close to storage buffer range limits, and gets much worse as chunk size increases. If we hand off metadata to the CPU after surface extraction, we could maintain only a small temporary buffer for worst-case surface extraction results, and more efficiently pack the main buffer using an allocator data structure on the CPU. This would also simplify #51 as the CPU could track the number of vertices in each surface, allowing it to construct indirect draw commands itself.
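
As a starting point, a sketch of the CPU-side bookkeeping: a trivial free list over fixed-size pages of the main vertex buffer. A real version would likely use variable-sized ranges (e.g. best-fit) once each surface's vertex count is read back after extraction; all names here are illustrative.

struct SurfaceArena {
    page_count: u32,
    free_pages: Vec<u32>,
}

impl SurfaceArena {
    fn new(page_count: u32) -> Self {
        Self { page_count, free_pages: (0..page_count).rev().collect() }
    }

    /// Page to copy an extracted surface into from the temporary buffer, or
    /// None if the main buffer is full and the chunk must wait.
    fn allocate(&mut self) -> Option<u32> {
        self.free_pages.pop()
    }

    /// Called when a chunk's surface is unloaded or re-extracted.
    fn free(&mut self, page: u32) {
        debug_assert!(page < self.page_count);
        self.free_pages.push(page);
    }
}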

More accurate view distance based graph expansion

After moving each character, Graph::ensure_nearby is used to expand the graph to contain all nodes whose centers lie within the view distance, and all neighbors of those nodes. Because chunks cannot be populated with terrain until the cell of the cubic honeycomb containing them has all incident nodes generated, this leads to characters being within view distance of chunks that lie within generated nodes but which cannot be populated as they have incident nodes which have not yet been generated.

To correct this without expanding the graph at an unsustainable rate, we need to use a chunk-aware traversal that populates nodes if they are incident to a chunk which is within view distance. To avoid requiring a hyperbolic cube vs. sphere intersection test, we can approximate chunk visibility by considering node vertex visibility or using bounding spheres, as in #49.

Abstract out environment factor propagation

Environment factors rely in somewhat subtle arithmetic to construct scalar fields on the graph with symmetric distributions, i.e. such that the origin cannot be identified. We shouldn't be separately hardcoding that math for every single environment factor.

Configurable curvature

Could take the form of specifying the length of the fundamental unit in meters, having the effect of scaling the underlying graph with respect to everything else. Voxels per cube would need to be adjusted as well to maintain a constant voxel size.

Decouple input timing sensitivity from rendering

We currently only collect input between frames. This makes the precision of control over movement (for example) proportional to framerate. We could improve on this by relying on precise input timestamps. Ideally we'd extend winit to collect these from the OS, but a more tractable approach would be to punt rendering/simulation onto a separate thread from input, record our own timestamps in the input thread, and stream timestamped events to the simulation.
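
A minimal sketch of the timestamped event stream; the event type and function names are illustrative:

use std::sync::mpsc;
use std::time::Instant;

enum Input {
    Key { pressed: bool, code: u32 },
    MouseDelta { dx: f64, dy: f64 },
}

struct TimestampedInput {
    at: Instant,
    event: Input,
}

/// Called from the input (winit event loop) thread as soon as an event arrives,
/// so the timestamp reflects when the event was delivered rather than when the
/// next frame started. The simulation thread drains the receiver each tick.
fn forward_input(tx: &mpsc::Sender<TimestampedInput>, event: Input) {
    let _ = tx.send(TimestampedInput { at: Instant::now(), event });
}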

' couldn't find required command: "cmake" '

C:\Users\<USER>\hypermine>cargo run --bin client --release
   Compiling shaderc-sys v0.6.2
The following warnings were emitted during compilation:

warning: System installed library not found.  Falling back to build from source

error: failed to run custom build command for `shaderc-sys v0.6.2`

Caused by:
  process didn't exit successfully: `C:\Users\<USER>\hypermine\target\release\build\shaderc-sys-cafcec58b5ca3411\build-script-build` (exit code: 101)
  --- stdout
  cargo:warning=System installed library not found.  Falling back to build from source

  --- stderr
  thread 'main' panicked at '

  couldn't find required command: "cmake"

  ', C:\Users\<USER>\.cargo\registry\src\github.com-1ecc6299db9ec823\shaderc-sys-0.6.2\build\cmd_finder.rs:50:13
  note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace

I installed CMake and put it in the PATH, so why is this happening?


Chunk-granularity frustum culling

As of #99, voxel rendering skips processing nodes that lie fully outside the frustum. For nodes that lie on the boundary, we could in principle do better by testing each chunk individually. The bounding spheres for those tests should be centered between the origin of the node and the chunk vertex, and have half the node's bounding sphere's radius.

Care should be taken to ensure that the extra CPU time to perform this check is justified by the GPU time savings.
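
A sketch of deriving those per-chunk spheres as described above, assuming points stored on the hyperboloid (x² + y² + z² - w² = -1), where the midpoint of two such points is their sum renormalized to the hyperboloid; the helper names are illustrative:

use nalgebra as na;

fn mip(a: &na::Vector4<f64>, b: &na::Vector4<f64>) -> f64 {
    a.x * b.x + a.y * b.y + a.z * b.z - a.w * b.w
}

/// Midpoint of two normalized hyperboloid points: their sum, renormalized so
/// that <m, m> = -1 again.
fn hyperbolic_midpoint(a: &na::Vector4<f64>, b: &na::Vector4<f64>) -> na::Vector4<f64> {
    let sum = a + b;
    sum / (-mip(&sum, &sum)).sqrt()
}

/// Bounding sphere for one chunk, per the heuristic described above: centered
/// between the node origin and the chunk's vertex, radius half the node's.
fn chunk_bounding_sphere(
    node_center: &na::Vector4<f64>,
    chunk_vertex: &na::Vector4<f64>,
    node_bounding_radius: f64,
) -> (na::Vector4<f64>, f64) {
    (
        hyperbolic_midpoint(node_center, chunk_vertex),
        node_bounding_radius / 2.0,
    )
}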

Recover memory from unneeded sections of the graph

The core Graph data structure that models world space presently grows without bound as players explore. This leads to unbounded memory use over time, giving instances of the world an effectively finite lifetime before they become unplayable.

To some extent this is inherent in the genre: a Minecraft world, continuously explored, will eventually fill your disk. However, hyperbolic space amplifies the problem: for a given view distance, a far larger volume is visible from a single viewpoint, to say nothing of the volume visible in the course of traveling a certain distance. Graph size is linearly proportional to these figures.

Exact graph memory use as a function of distance traveled by players should be analyzed to quantify the magnitude of the problem. If the memory growth is indeed significant enough to threaten reasonable playtimes, we should take measures to free unused sections of the graph.

Graph nodes must not be freed if they encode data that cannot be regenerated on demand. Nodes that must be preserved include those that contain persistent entities, those that contain chunks that have been persistently modified, and those that the graph considers parents of persistent nodes. One way to track this is a per-node persistent reference count (sketched after the list below):

  • In each node, maintain a reference count describing the number of persistent references that require it.
  • When a persistent reference is introduced, increment the associated node's counter. If the counter was previously zero, this constitutes introducing another persistent reference to the node's parent, and the algorithm recurses.
  • When a persistent reference is removed, decrement the associated node's counter. If the counter becomes zero, this constitutes removing a persistent reference from the node's parent, and the algorithm recurses. Nodes that have zero persistent references are eligible to be freed.
  • When a persistent reference is moved between nodes, special care should be taken if the original node's counter would become zero and the target node's counter is initially zero. Execute both the removal and the introduction simultaneously, always recursing on the node furthest from the origin. If the recursions reach a common ancestor, terminate them both without updating the ancestor's counter. This ensures that e.g. an entity moving between neighboring nodes requires only constant work.
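
A minimal sketch of that bookkeeping; NodeId and the parent lookup are placeholders for the graph's real types, and the constant-work move optimization from the last point is omitted:

use std::collections::HashMap;

type NodeId = u32;

struct Persistence {
    /// Nodes absent from this map have zero persistent references and are
    /// candidates for freeing.
    refcounts: HashMap<NodeId, u32>,
}

impl Persistence {
    fn add_ref(&mut self, node: NodeId, parent_of: fn(NodeId) -> Option<NodeId>) {
        let count = self.refcounts.entry(node).or_insert(0);
        *count += 1;
        if *count == 1 {
            // First reference: the parent must also be kept alive.
            if let Some(parent) = parent_of(node) {
                self.add_ref(parent, parent_of);
            }
        }
    }

    fn remove_ref(&mut self, node: NodeId, parent_of: fn(NodeId) -> Option<NodeId>) {
        let count = self.refcounts.get_mut(&node).expect("reference count underflow");
        *count -= 1;
        if *count == 0 {
            self.refcounts.remove(&node);
            // Last reference gone: this also releases our hold on the parent.
            if let Some(parent) = parent_of(node) {
                self.remove_ref(parent, parent_of);
            }
        }
    }
}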

Server can transmit the same graph nodes twice

When a client first connects, the graph is synchronized to them in full. Each timestep, newly introduced graph nodes from that step are broadcast to all clients. These sets can overlap, leading to the same nodes being added to the graph twice, which produces invalid topology and leads to panics when traversing the graph.

One solution could be to replace the first incremental update with a full sync directly, rather than having an independent initial sync.

Segfaults when compiling

kimapr@DComp ~$ neofetch
 ..                             `.   kimapr@DComp 
 `--..```..`           `..```..--`   ------------ 
   .-:///-:::.       `-:::///:-.     OS: Guix System 78d28ffc84c7f5dbd2555783ed 
      ````.:::`     `:::.````        Host: HP 15 Notebook PC 098F12000000000000 
           -//:`    -::-             Kernel: 5.4.49-gnu 
            ://:   -::-              Uptime: 11 hours, 55 mins 
            `///- .:::`              Packages: 55 (guix-system), 113 (guix-user 
             -+++-:::.               Shell: bash 5.0.16 
              :+/:::-                Resolution: 1280x720, 1280x720 
              `-....`                DE: awesome 
                                     Theme: Raleigh [GTK3] 
                                     Icons: gnome [GTK3] 
                                     Terminal: xfce4-terminal 
                                     Terminal Font: Nimbus Mono L Bold 12 
                                     CPU: Intel i5-5200U (4) @ 2.700GHz 
                                     GPU: NVIDIA GeForce 610M/710M/810M/820M /  
                                     GPU: Intel HD Graphics 5500 
                                     Memory: 3511MiB / 3853MiB 
kimapr@DComp ~/projects/hypermine$ catchsegv cargo build --bin client --release
warning: unused config key `http.cafile` in `/home/kimapr/.cargo/config`
   Compiling proc-macro2 v1.0.18
   Compiling libc v0.2.72
   Compiling syn v1.0.34
   Compiling libm v0.2.1
error: failed to run custom build command for `proc-macro2 v1.0.18`

Caused by:
  process didn't exit successfully: `/home/kimapr/projects/hypermine/target/release/build/proc-macro2-f706e0384b5e14fb/build-script-build` (signal: 11, SIGSEGV: invalid memory reference)
warning: build failed, waiting for other jobs to finish...
error: failed to run custom build command for `syn v1.0.34`

Caused by:
  process didn't exit successfully: `/home/kimapr/projects/hypermine/target/release/build/syn-ecf41d81c28f6182/build-script-build` (signal: 11, SIGSEGV: invalid memory reference)
warning: build failed, waiting for other jobs to finish...
error: failed to run custom build command for `libm v0.2.1`

Caused by:
  process didn't exit successfully: `/home/kimapr/projects/hypermine/target/release/build/libm-7ba3ac4581399228/build-script-build` (signal: 11, SIGSEGV: invalid memory reference)
warning: build failed, waiting for other jobs to finish...
error: failed to run custom build command for `libc v0.2.72`

Caused by:
  process didn't exit successfully: `/home/kimapr/projects/hypermine/target/release/build/libc-d0d3700ef5fb9c16/build-script-build` (signal: 11, SIGSEGV: invalid memory reference)

Distance fog

A simple exponential depth fog should be stapled onto the end of the rendering pipeline, exactly strong enough to hide voxel chunk pop-in at the configured view distance.
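
As a worked example of "exactly strong enough": with fog = exp(-density * depth), pick the density so that only a small residual of a surface's color survives at the configured view distance. The residual value is a tuning choice, not something specified anywhere.

/// Density such that exp(-density * view_distance) == residual, e.g. 1%.
fn fog_density(view_distance: f32, residual: f32) -> f32 {
    -residual.ln() / view_distance
}

// Example: fog_density(48.0, 0.01) ≈ 0.096; the fragment shader then blends in
// the fog color with weight 1 - exp(-density * depth).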

Frustum culling

We can massively reduce the amount of vertex processing done on the GPU with a relatively simple frustum culling check per terrain chunk.

Distance fog changes colour based on height relative to ground plane

It would be helpful when navigating to have some visual feedback about how deep or how high you are on a large scale. Equidistant surfaces tend to look equally curved beyond a certain size, and it is difficult to find your way back to planar ground when you're on an equidistant surface.

Indexed chunk rendering

We could potentially reduce vertex workload by 1/3 via the post-transform cache by supplying a simple static index buffer for a chunk's worth of faces. The index buffer is the same for all chunks because each chunk is a series of quads.

Best to pursue this after establishing a rendering benchmark, so any actual performance impact can be measured.
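
For illustration, the static buffer is just the same quad-to-triangles expansion repeated up to the worst-case face count; the assumption that each face contributes four vertices in a fixed order is illustrative, not the engine's current layout.

/// Build the shared static index buffer. With indexing, each face needs 4
/// unique vertices instead of 6, and the post-transform cache can reuse the
/// shared ones.
fn quad_indices(max_faces: u32) -> Vec<u32> {
    let mut indices = Vec::with_capacity(max_faces as usize * 6);
    for face in 0..max_faces {
        let base = face * 4;
        indices.extend_from_slice(&[base, base + 1, base + 2, base + 2, base + 1, base + 3]);
    }
    indices
}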
