rkochanowski (OP) 4 hours ago
I built Slopo to solve one specific problem: finding similar code that is hardest to detect by other tools, coding AI agents, and humans.
It finds similar-looking code with embeddings. This detects more than just copy-paste clones or clones with minor changes. The trade-off is that similar code is often not a clone worth refactoring. Initial results need to be verified, but coding agents can do this quickly. Example prompts are available at https://slopo.dev
Additionally, similar code distant in the codebase is ranked higher to focus on less obvious duplication.
The results differ a lot depending on the codebase. In some codebases most of the detected duplicates are false positives, but the remaining ones are strong candidates for refactoring, or even bugs. In others it reveals much more real duplication.
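To make the idea above concrete, here is a minimal sketch of embedding-based similarity with a bonus for pairs that are far apart in the directory tree. This is not Slopo's actual implementation: the function names, the `threshold` and `distance_weight` values, and the path-distance formula are all assumptions for illustration, and the embeddings are expected to come from whatever model you use.

```python
import math
from itertools import combinations
from pathlib import PurePosixPath


def cosine(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0


def path_distance(p, q):
    """Directory steps between the parent folders of two files (hypothetical metric)."""
    a = PurePosixPath(p).parent.parts
    b = PurePosixPath(q).parent.parts
    common = 0
    for x, y in zip(a, b):
        if x != y:
            break
        common += 1
    return (len(a) - common) + (len(b) - common)


def rank_pairs(snippets, threshold=0.9, distance_weight=0.05):
    """Rank similar pairs, highest score first.

    snippets: list of (path, embedding) tuples.
    Pairs below `threshold` similarity are dropped; the rest are scored as
    similarity plus a small bonus per directory step between the two files.
    """
    results = []
    for (pa, ea), (pb, eb) in combinations(snippets, 2):
        sim = cosine(ea, eb)
        if sim < threshold:
            continue
        score = sim + distance_weight * path_distance(pa, pb)
        results.append((score, pa, pb))
    results.sort(reverse=True)
    return results
```

With a scheme like this, two near-identical snippets in the same folder score lower than an equally similar pair living in unrelated parts of the tree, which is the "less obvious duplication" case. Every pair above the threshold is still only a candidate and needs review.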
realxrobau 3 hours ago
If it did PHP I would love to run it over WordPress. What would it take to add that?
rkochanowski (OP) 3 hours ago
PHP support can be added easily; I will release a new version soon.
raro11 2 hours ago
Thank you
BrandiATMuhkuh an hour ago
What a simple and smart idea. Wonderful
forhadahmed an hour ago
Self-plug (for a similar tool): https://github.com/forhadahmed/refactor
philajan an hour ago
This is neat. Have you noticed any difference in duplicate detection between strongly typed and loosely typed languages / code bases?
rkochanowski (OP) 19 minutes ago
No. It depends mostly on general code quality and architecture. Some implementations require more code similarity by design. Some languages, like Java, may tend to have more duplication, but that's only a guess on my part. It also depends on what kind of software is developed in which language.
If you are interested in data, you can check my article. The analysis was done with this tool, but with a previous version that excluded exact-copy duplicates. https://rkochanowski.com/article/analysis-code-duplication/
murats 3 hours ago
Nice idea. I can see this being useful before refactors, especially when the duplication is semantic rather than copy-paste.
hdz an hour ago
Very nice. I can imagine putting this into a pre-push hook to keep things clean after an initial sweep.
NYCHMPAI 3 hours ago
[flagged]
SpyCoder77 an hour ago
I think this is pretty cool, but is there any reason why we would want to remove similar or possibly duplicated code?
rufius 21 minutes ago
(without sarcasm) Is this a serious question?
If so: maintainability and testability. This is long-established software engineering best practice at this point.
You shouldn’t hyper-optimize for deduplication, but it’s usually worth considering. It also means fewer places to fix issues or make improvements.
Zopieux 25 minutes ago
Have you written software before?