While completing a Magento 1 to 2 migration, we started noticing errors when saving categories in the Magento 2 admin:
"The value specified in the URL Key field would generate a URL that already exists."
This issue is caused by Magento 1 not enforcing unique url_key product attribute values. Through imports or product duplication, the same URL key can end up on multiple products.
When Magento 1 tries to fetch a product's URL and sees that the key is not unique, it suffixes the URL with the product's ID.
The system reports no issue and the store owner has poorly optimised search engine URLs 🤷‍♂️.
Magento 2 does not allow this duplication. Instead, it throws an exception when generating product and category URLs if two url_key attribute values in the system are identical.
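To make the failing condition concrete, here is a minimal Python sketch that groups products by url_key and reports every key shared by more than one SKU. It is an illustration only, not Magento code: the function name and the (sku, url_key) pairs are hypothetical, and it assumes you already have those pairs exported from your store.

```python
from collections import defaultdict


def find_duplicate_url_keys(pairs):
    """Return {url_key: [skus]} for every url_key used by more than one SKU.

    `pairs` is an iterable of (sku, url_key) tuples (hypothetical input,
    for example rows read from an export of your catalog).
    """
    skus_by_key = defaultdict(list)
    for sku, url_key in pairs:
        skus_by_key[url_key].append(sku)
    # Keep only the keys that more than one product shares.
    return {key: skus for key, skus in skus_by_key.items() if len(skus) > 1}


if __name__ == "__main__":
    sample = [("A", "blue-shirt"), ("B", "blue-shirt"), ("C", "red-hat")]
    for key, skus in find_duplicate_url_keys(sample).items():
        print(f"{key}: {', '.join(skus)}")
```

Any key this reports is one that Magento 2 would refuse when it generates URLs.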
To see how many products are affected by this issue on your store, the following module returns helpful stats.
I had just over 3,000, so I wasn't going to fix those by hand.
I needed to find the actual URLs of all products that have duplicate url_key attributes in the Magento 1 system.
To do this, I fetched them directly from MySQL with the following query:
SELECT cpe.sku, cur.request_path
FROM core_url_rewrite AS cur
INNER JOIN catalog_product_entity AS cpe ON cpe.entity_id = cur.product_id
WHERE cur.store_id = 1 AND cur.target_path LIKE '%catalog/product/view/id%' AND cur.target_path NOT LIKE '%category%';
This returns every SKU together with the URL path actually in use, so the ID suffix is included wherever Magento 1 added one.
It is important for SEO and usability to keep the correct URLs after migration.
I exported this data to a CSV file and, in a text editor, removed the .html string from the request_path values. Whether that step is required depends on your Magento 1 config.
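If you would rather script that clean-up than do it in a text editor, a small Python sketch along these lines would work. It assumes the export has a header row with sku and request_path columns (matching the query above). The output column name url_key, the file names, and the function names are my own choices, not part of any Magento tooling.

```python
import csv


def strip_suffix(path, suffix=".html"):
    """Remove a trailing suffix such as .html, leaving other paths untouched."""
    if suffix and path.endswith(suffix):
        return path[: -len(suffix)]
    return path


def clean_csv(src, dst, suffix=".html"):
    """Read sku,request_path rows from `src` and write sku,url_key rows to `dst`.

    `src` and `dst` are open text file objects; the column names are assumptions.
    """
    reader = csv.DictReader(src)
    writer = csv.DictWriter(dst, fieldnames=["sku", "url_key"])
    writer.writeheader()
    for row in reader:
        writer.writerow(
            {"sku": row["sku"], "url_key": strip_suffix(row["request_path"], suffix)}
        )


if __name__ == "__main__":
    # Hypothetical file names; adjust them to your own export.
    with open("m1_urls.csv", newline="") as src, \
            open("m2_url_keys.csv", "w", newline="") as dst:
        clean_csv(src, dst)
```

Only a trailing .html is removed, so a key that happens to contain that string elsewhere is left alone. Pass a different suffix if your Magento 1 store used another one.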
I then uploaded this CSV file and wrote a quick PHP script to import these values to the correct products in Magento 2. You can see that script here
After running this, the Magento 2 product url_key values were all unique and identical to the product URLs that Magento 1 returned.
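A quick way to double-check that result is to compare the expected values against what ended up in Magento 2. The Python sketch below assumes you have both sets available as SKU-to-url_key dictionaries. How you export them from Magento 2 is up to you, and the function name is hypothetical.

```python
def compare_url_keys(expected, actual):
    """Return {sku: (expected_key, actual_key)} for every SKU that does not match.

    `expected` maps SKU to the url_key taken from Magento 1;
    `actual` maps SKU to the url_key now stored in Magento 2.
    A SKU missing from `actual` is reported with None as its actual key.
    """
    mismatches = {}
    for sku, want in expected.items():
        got = actual.get(sku)
        if got != want:
            mismatches[sku] = (want, got)
    return mismatches


if __name__ == "__main__":
    m1 = {"A": "blue-shirt", "B": "blue-shirt-42"}
    m2 = {"A": "blue-shirt", "B": "blue-shirt"}
    for sku, (want, got) in compare_url_keys(m1, m2).items():
        print(f"{sku}: expected {want!r}, found {got!r}")
```

An empty result means every product kept the URL key it had in Magento 1.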
This minimises the chances of broken URLs when launching the new M2 site.
Now the data integrity check is reporting all clear!