How do I handle very large loops?

:information_source: Attention Topic was automatically imported from the old Question2Answer platform.
:bust_in_silhouette: Asked By batmanasb
:warning: Old Version Published before Godot 3 was released.

I’m trying to generate terrain using this cellular automata tutorial and need pretty large double for loops of at least 1000 x 1000 or preferably much higher. 10k x 10k would be perfect. However, as I found out the hard way at Ludum Dare 34, large loops stall the game for way too long and often cause crashes on Windows exports. How can I improve/optimize my loop speeds?

The process requires a large double for loop for generating a 2D array, and then I have to generate a new one for every step. Here’s my code so far:

#map config
var mapSize = 1000
var birthLimit = 4
var deathLimit = 3
var initialChance = 0.4
var numSteps = 4

func _ready():
    var map = generateMap():

#generate map over several steps (generations)
func generateMap():
	var map = initializeMap()
	#run the simulation for a set number of steps (generations)
	for i in range(numSteps):
		map = doSimulationStep(map)
	return map


#make a map where every cell has the same chance to spawn alive
func initializeMap():
	var map = []
	for x in range(mapSize):
		map.append([])
		for y in range(mapSize):
			map[x].append(randf()<initialChance)
	return map


#map a new map based on the old map
func doSimulationStep(oldMap):
	var map = []
	for x in range(mapSize):
		map.append([])
		for y in range(mapSize):
			#count Neighbors in the old map
			var numNeighbors = countLivingNeighbors(oldMap, x, y)
			#if cell is alive
			if(oldMap[x][y]):
				if(numNeighbors < deathLimit):
					#kill cell
					map[x].append(false)
				else:
					#keep cell alive
					map[x].append(true)
			#if cell is dead
			else:
				if(numNeighbors > birthLimit):
					#revive cell
					map[x].append(true)
				else:
					#keep cell dead
					map[x].append(false)
	return map


#returns the number of cells in a ring around (x,y) that are alive
func countLivingNeighbors(map, x, y):
	var count = 0
	for i in range(-1,2):
		print(i)
		for j in range(-1,2):
			var nX = x+i
			var nY = y+j
			#ignore middle point
			if( not(nX == 0 and nY == 0) ):
				#if the neighbour if off the edge of the map
				if(nX < 0 or nY < 0 or nX == mapSize or nY == mapSize):
					#count it anyways
					count += 1
				#if neighbour is alive, count it
				elif(map[nX][nY]):
					count += 1
	return count  

Some ideas I think might work are:

  1. Spreading the generation over several frames (to at least prevent crashes… maybe)?
  2. Threads? But the process is sequential so it can’t be broken up, but maybe a background thread so the main thread isn’t stalled?
  3. Background loading??? Haven’t looked too far into this to know exactly how it works.
  4. Maybe append()is the issue and I need to figure out how to initialize a [1000][1000] 2D array?
  5. Maybe I need to go into C++ and make a module to do it?
  6. Maybe I need to just give up on large loops and run several small 100 x 100 loops as the player moves around… although this will affect the way the terrain generates and probably wouldn’t be suitable for my game…

Edit 1: moving the generateMap() function call to _init() instead of having it in _ready() will probably make it more stable… so it loads before the game starts and not during.

Edit 2 (aka someone beat me to it): I found another cellular automata project in Godot that uses it for it’s more intended purpose (caves). Although his doesn’t seem to handle large areas any better than mine :frowning:

Please look at this two issues For loop with true iterators instead of arrays and Game crashes on range(0,100000000)

puppetmaster- | 2016-04-11 09:20

I took a look but I’m still not sure if they fixed range() or not (commit reverted?). However, I converted all the double for loops into double while loops it now generates about 5% - 10% faster. But the generation rate is still way too slow…

batmanasb | 2016-04-11 10:31

GDScript optimization is still mostly nonexistent, I guess this should be improved before 3.0/3.1/4.0… maybe you can try writing it as C++?

Bojidar Marinov | 2016-04-11 10:59

Terrain generation should be easy to parallelize, unless you need to randomly access to any cell of space at any time. Using a thread is fine, but it would still take forever. Maybe you can find a way to break it up in chunks so you can generate smaller portions of the world (like Minecraft), and prioritize those the player can see. In case you have to deal with neighboring, data duplication is an option (increases memory usage but also increases performance because you can spawn more threads).
You can also pick another algorithm, Simplex/Perlin noise should do the trick pretty well too :slight_smile:

Zylann | 2016-04-11 17:01

I think I figured out a way to hide it from the user. I moved the generation process to its own thread, and when it finishes it saves it to a nextMap variable. This way it can generate the nextMap while the user is playing, and then simply swap to the new map when they restart. Also, I can save the nextMap variable to a file in case the user quits, so that it loads into it when the user launches the game, and save one of my own maps for the first time someone runs the game.

Also, I didn’t realize it was possible, but after a quick google search I now think cellular automata can be parallelized, or at least parts of it. I’m going to do a bit more research on this right after I improve how the tilemap loads the map tiles.

batmanasb | 2016-04-12 12:52

:bust_in_silhouette: Reply From: batmanasb

Okay so I managed to use threads to go through the while loops several rows in parallel. For example, here’s what my _ready() and initializeMap() methods changed into:

_ready():
    #make a bunch of threads for parrallel processing
	for i in range(numParallelThreads):
		parallelThreads.append(Thread.new())

#make a map where every cell has the same chance to spawn alive
func initializeMap():
	var map = []
	var x = 0
	while(x < mapSize):
		#assign each parallel thread to process a row
		var active = 0
		while(active < numParallelThreads and x < mapSize):
			map.append([])
			parallelThreads[active].start(self, "initializeRow", x)
			active += 1
			x += 1
		
		#wait for each thread to finish (note that this isn't the main thread, so waiting this long is okay)
		var i = active-1
		while(active > 0):
			insertRowFromThread(map, parallelThreads[i].wait_to_finish())
			i -= 1
			active -= 1
	return map


func initializeRow(x):
	var row = []
	var y = 0
	while(y < mapSize):
		row.append(randf()<initialChance)
		y += 1
	return [x,row]


func insertRowFromThread(map, data):
	#map[x] = row
	map[data[0]] = data[1]

Basically, how it works is instead of making one row at a time (like a double for loop would do), I instead made an array of threads (20 seemed to work the best, any more and performance starts to drop) and assigned a row to each thread, then waited for each one to return it’s value. This way I can process 20 rows at a time.

The result was a huge speed improvement, although it’s still nowhere near instant, but I figured out a way to hide it. (By generating the next map while you play, and loading the first map from a save file, which was generated when you last played) Here are some benchmarks I made (which are in milliseconds, so for example: 1k = 1000 milliseconds):

#size = 1000
#steps = 5
#1:67k
#2:48k
#10:39k
#20:38k
#100:39k

#size = 1000
#steps = 0
#1:0.7k
#2:0.6k
#10:0.6k
#20:0.6k
#100:0.6k

#size = 1000
#steps = 1
#1:14k
#2:10k
#10:8k
#20:8k
#100:8k

#size = 1000
#steps = 10
#1:133k
#2:97k
#10:80k
#20:78k
#100:80k small FPS drops

#size = 3000
#threads = 20
#5:342k
#3:215k
#2:142k
#1:72k
#0:3k
#69k per step 

Also note that these benchmarks get the time from when the game launches to when the map generated. (so before the tilemap begins loading the images… this part doesn’t appear to work in a thread, so I’m planning to call set_cell on nearby tiles as the player walks by)

Very useful information, good job!

patrik | 2016-04-13 19:11