+1 vote

Are there any built in functions for doing this? If not what is the best method in this language? I know python has a number of one liners that don't even require for loops, just curious if this is possible in GDScript as well.

asked Jun 2, 2016 in Engine by independentCog (59 points)

Do you want to know if there are duplicates or do you want all indexes that are duplicates?

just to see if the array that had values appended into it beforehand contains duplicates and than if it does it will change a bool that will decide whether a later if statement does something or not. In a nutshell what I'm trying to do.

3 Answers

+1 vote
Best answer

I'd say it depends on the size of the array. For small arrays (less than ~10) it's probably fastest to just loop as many times as there are entries as the overhead of creating a dictionary will be bigger than the benefit.
For large collections on the other hand I would do it the way it's usually done in lua: just use a dictionary with empty values. In other words do what Zylann recommended but don't rebuild the "seen"-dictionary every time.

answered Jun 3, 2016 by Warlaan (1,114 points)
selected Jun 4, 2016 by independentCog

Not sure if I implemented this correctly, I've never used integer arrays before so excuse my newbness.

func check_duplicates(a):
    var is_dupe = false

    for i in range(0, IntArray(a).size(), -1):
        if is_dupe == true:
            break
        for j in range(IntArray(a).size()):
            if IntArray(a)[j] == IntArray(a)[i] && i != j:
                is_dupe = true
                spell_cast_success = false
                print("duplicate")
                break

I'm pretty sure I made the changes you recommended but now it doesn't recognize duplicates. A code example of what you mean would be helpful.

So much for guessing performance. :-)
I did a quick test where I generated 5 numbers between 0 and 7 and added them to a container that checked for duplicates. That test ran in a loop 10.000 times for each of four containers: one that uses a dictionary to check for duplicates, one that uses the GDScript Array class, one that uses the IntArray class and one that uses a raw int-Array in C++ code.

These are the results:

** Debug Process Started **
Native code needed 44.984001ms.
IntArray code needed 262.967987ms.
Array code needed 165.378006ms.
Dictionary code needed 140.057007ms.
** Debug Process Stopped **

I am mostly surprised that the IntArray did not lead to better performance, but now that I checked the documentation it says that it's mostly meant to prevent fragmentation with large arrays, so at least it's not meant to be any faster for small arrays. I'll have to check the source to understand what it really does.

In any way the lesson here is that a native implementation is the fastest solution (not due to GDScript overhead since the GDScript interface of all these classes is the same, so the number of calls from GDScript to C++ should be identical) and in your case it won't matter which solution you use since if you run the test only once (i.e. add 5 items) the slowest solution will still need no more than 0.05 milliseconds (= 1/400th of a frame at 50fps).

FYI this is the test code:
(NativeSet and PerformanceTimer are C++ classes written by me, so don't google for them ;-) )

extends Node

var tries = 1
var size = 5
var items = 8

func _ready():
    var timer = PerformanceTimer.new()

    testNative()
    print("Native code needed " + str(timer.get_elapsed_ms()) + "ms.")

    timer.restart()
    testIntArray()
    print("IntArray code needed " + str(timer.get_elapsed_ms()) + "ms.")

    timer.restart()
    testArray()
    print("Array code needed " + str(timer.get_elapsed_ms()) + "ms.")

    timer.restart()
    testDictionary()
    print("Dictionary code needed " + str(timer.get_elapsed_ms()) + "ms.")


func testNative():
    var set = NativeSet.new()
    set.setup(size)
    test(set)

func testIntArray():
    var set = IntArraySet.new()
    test(set)

func testArray():
    var set = ArraySet.new()
    test(set)

func testDictionary():
    var set = DictionarySet.new()
    test(set)

func test(var set):
    for i in range(tries):
        for j in range(size):
            var newItem = randi() % items
            set.add(newItem)
        set.clear()

class IntArraySet:
    var array = IntArray([])

    func add(value):
        if (contains(value)):
            return false
        array.push_back(value)
        return true

    func contains(value):
        for val in array:
            if (val == value):
                return true
        return false

    func clear():
        array.resize(0)

class ArraySet:
    var array = []

    func add(value):
        if (contains(value)):
            return false
        array.push_back(value)
        return true

    func contains(value):
        for val in array:
            if (val == value):
                return true
        return false

    func clear():
        array.clear()

class DictionarySet:
    var array = {}

    func add(value):
        if (contains(value)):
            return false
        array[value] = 1
        return true

    func contains(value):
        return array.has(value)

    func clear():
        array.clear()

It also gives a good example I believe of the OOP you were talking about in the other post. It is interesting how proportionally faster the native c++ is compared to gdscripts array class. It makes more sense now why its recommended for more intense things. I also didn't think to look into the timer class that counts ms. I'll have to keep that in mind for further tests as well. So many different options for each node in this engine. Overall great info.

The timer class? Do you mean PerformanceTimer? That's a very simple class that I wrote myself. If you like I can give you the code, you'll just have to place it in the modules folder and compile Godot yourself (which thanks to SCons and very few dependencies is really easy to do).

http://docs.godotengine.org/en/latest/classes/class_timer.html?highlight=timer
Nope I meant this timer class. =P I haven't played with it enough yet. I need to.
But yeah I'd definitely be interested in seeing how it works and I've been wondering on the process of adding my own classes and nodes into Godot. The fact that I can do that with a game engine is too amazing to pass up.
EDIT: nm lol I just looked at the reference and realized the ms feature is from your class and not a part of godots timers. Just thought it would have something like that for some reason. I must of read the top part of your code here too fast and missed you were refering to performanceTimer with timer Oops.

+1 vote

This function will return which indexes of the array are containing duplicates.

func get_duplicates(a):
    if a.size() < 2:
        return []

    var seen = {}
    seen[a[0]] = true
    var duplicate_indexes = []

    for i in range(1, a.size()):
        var v = a[i]
        if seen.has(v):
            # Duplicate!
            duplicate_indexes.append(i)
        else:
            seen[v] = true

    return duplicate_indexes

There are faster ways without hashing on small arrays, but I don't think they make a difference in GDScript.

answered Jun 2, 2016 by Zylann (26,157 points)

The array itself is keyed into a dictionary, wasn't sure how using .has on that would affect it. I'll have to play with this method some to see what else I can do with it. Thanks =D

func check_duplicates(a):
    var seen = {}
    seen[a[0]] = true

    for i in range(1, a.size()):
        print(a[i])
        var v = a[i]
        if seen.has(v):
            bool_trigger = false
        else:
            seen[v] = true

I just modified it like this to do what I needed. Works like a charm. Your awesome!

0 votes

This is another version for smaller arrays:

func find_duplicates_small(a):
    var duplicates = []
    for i in range(0, a.size()):
        for j in range(j+1, a.size()):
            if a[j] == a[i]:
                duplicates.append(j)
    return duplicates

The idea is to loop a second time for each element, but only starting from the next so we can discard b == a because that's the same as a == b.

As for my first version, it will return the index of duplicates, but not the first occurences of the value. So for example, if you have ["a", "b", "c", "b"], the function will return [3]. If you want both you can adapt the function so it also inserts i the first time a duplicate is found. Or, if order doesn't matters, you can check the size of duplicates after the inner loop, then if it changed insert i.

Again, you can tweak it the way you like. You know your problem, so you could make this algorithm more efficient by adapting it for you need :)

answered Jun 4, 2016 by Zylann (26,157 points)
edited Jun 4, 2016 by Zylann

In godot there is a easier 1 line method to check the number of times an element is repeated in an array. The below method can be used to avoid iterating the whole length of the array.
Try something like
var count=arrayname.count(elementtobefound)
if count>0:
print("Element is present multiple times")

Welcome to Godot Engine Q&A, where you can ask questions and receive answers from other members of the community.

Please make sure to read How to use this Q&A? before posting your first questions.