Copilot & Submodules
A while ago I got access to the GitHub Copilot program. I've yet to be impressed by the tool, but I figured I'd share some of the results that it generated.
I figured it'd be fun to work on a fairlearn project while using the Copilot suggestion feature. Here's the first suggestion that it generated.
It's a really interesting suggestion, but it's also wrong. The project doesn't
have a fairlearn.classification submodule and it also doesn't have an EmpiricalClassifier.
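A suggestion like this can be checked before you trust it. Here's a minimal sketch, using only the standard library, that asks Python whether a dotted module name can be found and whether it exposes a given attribute. The helper names module_exists and attribute_exists are my own and purely illustrative; the fairlearn names are the ones from the suggestion.

```python
import importlib
import importlib.util


def module_exists(dotted_name):
    """Return True if `dotted_name` can be located as a module."""
    try:
        return importlib.util.find_spec(dotted_name) is not None
    except ImportError:
        # Raised when a parent package (e.g. `fairlearn` itself) cannot be imported.
        return False


def attribute_exists(module_name, attribute):
    """Return True if the module can be imported and exposes `attribute`."""
    if not module_exists(module_name):
        return False
    module = importlib.import_module(module_name)
    return hasattr(module, attribute)


# Names taken from the Copilot suggestion, not from the real fairlearn API.
print(module_exists("fairlearn.classification"))
print(attribute_exists("fairlearn.classification", "EmpiricalClassifier"))
```

Given that the submodule doesn't exist, both checks should come back False, whether or not fairlearn is installed.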
Here's what you get when you ask the system to generate a paragraph.
Notice the TransparentLogisticClassifier there? It's an interesting example
because it seems to parrot back to us that we're indeed interested in something
related to algorithmic fairness, but it generates an API that simply does not exist.
Fancy a laugh?
Here's some more code that it keeps generating.
from fairlearn.classification.simulation import simulate_data
from fairlearn.classification.simulation import simulate_data_from_file
from fairlearn.classification.simulation import simulate_data_from_file_with_noise
from fairlearn.classification.simulation import simulate_data_from_file_with_noise_and_weights
from fairlearn.classification.simulation import simulate_data_from_file_with_noise_and_weights_and_class_balance
from fairlearn.classification.simulation import simulate_data_from_file_with_noise_and_weights_and_class_balance_and_class_imbalance
from fairlearn.classification.simulation import simulate_data_from_file_with_noise_and_weights_and_class_balance_and_class_imbalance_and_weights
from fairlearn.classification.simulation import simulate_data_from_file_with_noise_and_weights_and_class_balance_and_class_imbalance_and_weights_and_class_imbalance
from fairlearn.classification.simulation import simulate_data_from_file_with_noise_and_weights_and_class_balance_and_class_imbalance_and_weights_and_class_imbalance_and_class_balance
from fairlearn.classification.simulation import simulate_data_from_file_with_noise_and_weights_and_class_balance_and_class_imbalance_and_weights_and_class_imbalance_and_class_balance_and_class_imbalance
from fairlearn.classification.simulation import simulate_data_from_file_with_noise_and_weights_and_class_balance_and_class_imbalance_and_weights_and_class_imbalance_and_class_balance_and_class_imbalance_and
Sure, it generates code that *could* exist. But that doesn't mean the code does.
Generate a Script
So what happens when you ask it to generate more? You can see below how the comment is auto-completed.
From here it will parrot onwards.
It really looks like code that you might find on the internet, but is this useful? I'm not so sure. We seem to be dealing with a parrot here, not a co-pilot.
It's not just that the generated code won't run. The code also seems to
go against a style guide. It compares two approaches, but it does so by repeating
itself, which goes against the DRY principle. A for-loop may have been better, or
better yet, a GridSearchCV object.
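To make the DRY point concrete, here's a rough sketch of what a GridSearchCV comparison could look like with scikit-learn. Everything specific in it is an assumption on my part: the synthetic dataset, the LogisticRegression model and the grid over C are stand-ins, not the two approaches from the generated script.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GridSearchCV

# Synthetic data, purely for illustration.
X, y = make_classification(n_samples=200, n_features=5, random_state=0)

# Each candidate setting is listed once, instead of copy-pasting
# a fit-and-score block per setting.
param_grid = {"C": [0.1, 1.0, 10.0]}

search = GridSearchCV(LogisticRegression(max_iter=1000), param_grid, cv=3)
search.fit(X, y)

# One loop reports the cross-validated score for every candidate.
for params, score in zip(
    search.cv_results_["params"], search.cv_results_["mean_test_score"]
):
    print(params, round(score, 3))
print("best:", search.best_params_)
```

Adding another candidate now means adding one entry to the grid rather than another duplicated block of code.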