Other articles (78)

  • Adding notes and captions to images

    7 February 2011, by

    To add notes and captions to images, the first step is to install the "Légendes" plugin.
    Once the plugin is activated, you can configure it in the configuration area to change who is allowed to create, edit and delete notes. By default, only site administrators can add notes to images.
    Changes when adding a media item
    When adding a media item of type "image", a new button appears above the preview (...)

  • Mediabox: opening images in the maximum space available to the user

    8 February 2011, by

    Image display is limited by the width allowed by the site's design (which depends on the theme in use), so images are shown at a reduced size. To take advantage of all the space available on the user's screen, you can add a feature that displays the image in a multimedia box that appears above the rest of the content.
    To do this, you need to install the "Mediabox" plugin.
    Configuring the multimedia box
    As soon as (...)

  • Publishing on MédiaSpip

    13 June 2013

    Can I post content from an iPad tablet?
    Yes, if your installed MédiaSpip is version 0.2 or higher. If needed, contact the administrator of your MédiaSpip to find out.

On other sites (4284)

  • What Is Data Misuse & How to Prevent It? (With Examples)

    13 May 2024, by Erin

    Your data is everywhere. Every time you sign up for an email list, log in to Facebook or download a free app onto your smartphone, your data is being taken.

    This can scare customers and users who fear their data will be misused.

    While data can be a powerful asset for your business, it’s important you manage it well, or you could be in over your head.

    In this guide, we break down what data misuse is, what the different types are, some examples of major data misuse and how you can prevent it so you can grow your brand sustainably.

    What is data misuse?

    Data is a good thing.

    It helps analysts and marketers understand their customers better so they can serve them relevant information, products and services to improve their lives.

    But it can quickly become a bad thing for both the customers and business owners when it’s mishandled and misused.

    Data misuse is when a business uses data outside of the agreed-upon terms. When companies collect data, they need to legally communicate how that data is being used. 

    Who or what determines when data is being misused?

    Several sources set the rules:

    • User agreements
    • Data privacy laws
    • Corporate policies
    • Industry regulations

    There are certain laws and regulations around how you can collect and use data. Failure to comply with these guidelines and rules can result in several consequences, including legal action.

    Keep reading to discover the different types of data misuse and how to prevent it.

    3 types of data misuse

    There are a few different types of data misuse.

    If you fail to understand them, you could face penalties, legal trouble and a poor brand reputation.

    1. Commingling

    When you collect data, you need to ensure you’re using it for the right purpose. Commingling is when an organisation collects data from a specific audience for a specific reason but then uses the data for another purpose.

    One example of commingling is if a company shares sensitive customer data with another company. In many cases, sister companies will share data even if the terms of the data collection didn’t include that clause.

    Another example is if someone collects data for academic purposes like research but then uses the data later on for marketing purposes to drive business growth in a for-profit company.

    In either case, the company went wrong by not being clear on what the data would be used for. You must communicate with your audience exactly how the data will be used.

    2. Personal benefit

    The second common way data is misused in the workplace is through “personal benefit.” This is when someone with access to data abuses it for their own gain.

    The most common example of personal benefit data misuse is when an employee misuses internal data.

    While this may sound like each instance of data misuse is caused by malicious intent, that’s not always the case. Data misuse can still exist even if an employee didn’t have any harmful intent behind their actions. 

    One of the most common examples is when an employee mistakenly moves data from a company device to personal devices for easier access.

    3. Ambiguity

    As mentioned above, when discussing commingling, a company must only use data how they say they will use it when they collect it.

    A company can misuse data when they’re unclear on how the data is used. Ambiguity is when a company fails to disclose how user data is being collected and used.

    In other words, communicating poorly about how data will be used can itself amount to misuse.

    One of the most common ways this happens is when a company doesn’t know how to use the data, so they can’t give a specific reason. However, this is still considered misuse, as companies need to disclose exactly how they will use the data they collect from their customers.

    Laws on data misuse you need to follow

    Data misuse can lead to poor reputations and penalties from big tech companies. For example, if you step outside social media platforms’ guidelines, you could be suspended, banned or shadowbanned.

    But what’s even more important is that certain types of data misuse could mean you’re breaking laws worldwide. Here are some laws on data misuse you need to follow to avoid legal trouble:

    General Data Protection Regulation (GDPR)

    The GDPR, or General Data Protection Regulation, is a law within the European Union (EU) that went into effect in 2018.

    The GDPR was implemented to set a standard and improve data protection in Europe. It was also established to increase accountability and transparency for data breaches within businesses and organisations.

    The purpose of the GDPR is to protect residents within the European Union.

    The penalties for breaking GDPR laws are fines of up to 20 million euros or 4% of global annual revenue (whichever is higher).
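As a rough illustration of the "whichever is higher" rule, here is a minimal sketch of the fine ceiling described above. The function name and the sample revenue figures are hypothetical, and the actual fine in any case is set by the regulator, so treat this as arithmetic only.

```python
def gdpr_max_fine_eur(global_annual_revenue_eur):
    """Ceiling described above: the higher of EUR 20 million or 4% of global revenue."""
    flat_cap = 20_000_000
    revenue_cap = global_annual_revenue_eur * 4 / 100
    return max(flat_cap, revenue_cap)


# Hypothetical company with EUR 100 million revenue: 4% is 4 million, so the flat cap applies.
print(gdpr_max_fine_eur(100_000_000))
# Hypothetical company with EUR 1 billion revenue: 4% is 40 million, which is the higher amount.
print(gdpr_max_fine_eur(1_000_000_000))
```

For smaller companies the flat amount dominates, which is why the ceiling is not simply proportional to revenue.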

    The GDPR doesn’t just affect companies in Europe. You can break the GDPR’s laws regardless of where your organisation is located worldwide. As long as your company collects, processes or uses the personal data of any EU resident, you’re subject to the GDPR’s rules.

    If you want to track user data to grow your business, you need to ensure you’re following international data laws. Tools like Matomo—the world’s leading privacy-friendly web analytics solution—can help you achieve GDPR compliance and maintain it.

    With Matomo, you can confidently enhance your website’s performance, knowing that you’re adhering to data protection laws. 

    Try Matomo for Free

    Get the web insights you need, without compromising data accuracy.

    No credit card required

    California Consumer Privacy Act (CCPA)

    The California Consumer Privacy Act (CCPA) is another important data law companies worldwide must follow.

    Like GDPR, the CCPA is a data privacy law established to protect residents of a certain region — in this case, residents of California in the United States.

    The CCPA was implemented in 2020, and businesses worldwide can be penalised for breaking the regulations. For example, if you’re found violating the CCPA, you could be fined up to $7,500 for each intentional violation.

    Unintentional violations can still be fined, but at a lower amount of up to $2,500 each.

    The Gramm-Leach-Bliley Act (GLBA)

    If your business is a financial institution operating in the United States, then you’re subject to a federal law implemented in 1999 called the Gramm-Leach-Bliley Act (GLB Act or GLBA).

    The GLBA is also known as the Financial Services Modernization Act of 1999. Its purpose is to control the way American financial institutions handle consumer data. 

    In the GLBA, there are three sections:

    1. Financial Privacy Rule: Regulates the collection and disclosure of private financial data.
    2. Safeguards Rule: Requires financial institutions to establish security programs to protect financial data.
    3. Pretexting Provisions: Prohibit accessing private data using false pretences.

    The GLBA also requires financial institutions in the U.S. to give their customers written privacy policy communications that explain their data-sharing practices.

    4 examples of data misuse in real life

    If you want to see what data misuse looks like in real life, look no further.

    Big tech is central to some of the biggest data misuses and scandals.

    Here are a few examples of data misuse in real life you should take note of to avoid a similar scenario:

    1. Facebook election interference

    One of history’s most famous examples of data misuse is the Facebook and Cambridge Analytica scandal, which came to light in 2018.

    Ahead of the 2016 U.S. presidential election, Cambridge Analytica, a political consulting firm, acquired personal data from Facebook users that was said to have been collected for academic research.

    Instead, Cambridge Analytica used data from roughly 87 million Facebook users for political campaigning.

    This is a prime example of commingling.

    The result? Cambridge Analytica was left bankrupt and dissolved, and Facebook was fined $5 billion by the Federal Trade Commission (FTC).

    2. Uber “God View” tracking

    Another big tech company, Uber, was caught misusing data a decade ago. 

    Why?

    Uber implemented a new feature for its employees in 2014 called “God View.”

    The tool enabled Uber employees to track riders using their app. The problem was that they were watching them without the users’ permission. “God View” let Uber spy on its riders to see their movements and locations.

    The FTC ended up slapping them with a major lawsuit, and as part of their settlement agreement, Uber agreed to have an outside firm audit their privacy practices between 2014 and 2034.

    3. Twitter targeted ads overstep

    In 2019, Twitter was found guilty of allowing advertisers to access its users’ personal data to improve advertisement targeting.

    Advertisers were given access to user email addresses and phone numbers without explicit permission from the users. The result was that Twitter ad buyers could use this contact information to cross-reference with Twitter’s data to serve ads to them.

    Twitter stated that the data leak was an internal error. 

    4. Google location tracking

    In 2020, Google was found guilty of not explicitly disclosing how it’s using its users’ personal data, which is an example of ambiguity.

    The result?

    The French data protection authority fined Google $57 million.

    8 ways to prevent data misuse in your company

    Now that you know the dangers of data misuse and its associated penalties, it’s time to understand how you can prevent it in your company.

    Here are eight ways you can prevent data misuse:

    1. Track data with an ethical web analytics solution

    You can’t get by in today’s business world without tracking data. The question is whether you’re tracking it safely or not.

    If you want to ensure you aren’t getting into legal trouble with data misuse, then you need to use an ethical web analytics solution like Matomo.

    With it, you can track and improve your website performance while remaining GDPR-compliant and respecting user privacy. Unlike other web analytics solutions that monetise your data and auction it off to advertisers, with Matomo, you own your data.

    2. Don’t share data with big tech

    As the data misuse examples above show, big tech companies often violate data privacy laws.

    And while most of these companies, like Google, appear to be convenient, they’re often inconvenient (and much worse), especially regarding data leaks, privacy breaches and the sale of your data to advertisers.

    Have you ever heard the phrase “You are the product”? When it comes to big tech, chances are if you’re getting it for free, you (and your data) are the products they’re selling.

    The best way to stop sharing data with big tech is to stop using platforms like Google. For more ideas on different Google product alternatives, check out this list of Google alternatives.

    3. Identity verification 

    Data misuse typically isn’t a company-wide ploy. Often, it’s the lack of security structure and systems within your company. 

    An important place to start is to ensure proper identity verification for anyone with access to your data.

    4. Access management

    After establishing identity verification, you should ensure you have proper access management set up. For example, you should only give specific access to specific roles in your company to prevent data misuse.
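A minimal sketch of what role-based access could look like in code. The role names, permission names and the `is_allowed` helper are all hypothetical, chosen only to illustrate the idea of giving specific access to specific roles.

```python
# Hypothetical mapping of roles to the permissions they hold.
ROLE_PERMISSIONS = {
    "analyst": {"read_reports"},
    "support": {"read_reports", "read_customer_contact"},
    "admin": {"read_reports", "read_customer_contact", "export_raw_data"},
}


def is_allowed(role, permission):
    """Return True only if the role is known and explicitly holds the permission."""
    return permission in ROLE_PERMISSIONS.get(role, set())


print(is_allowed("analyst", "export_raw_data"))  # analysts cannot export raw data
print(is_allowed("admin", "export_raw_data"))    # admins can
```

Unknown roles get no permissions by default, so a typo or a new role fails closed rather than open.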

    5. Activity logs and monitoring

    One way to track data misuse or breaches is by setting up activity logs to ensure you can see who is accessing certain types of data and when they’re accessing it.

    You should ensure you have a team dedicated to continuously monitoring these logs to catch anything quickly.
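To make the idea concrete, here is a small sketch of an activity log that records who accessed which data and when. It keeps entries in memory purely for illustration; the field names and helper functions are assumptions, and a real setup would write to durable, access-controlled storage.

```python
from datetime import datetime, timezone

# In-memory log for illustration only.
access_log = []


def record_access(user, dataset, action, when=None):
    """Append one entry describing who touched which data, how, and when (UTC by default)."""
    timestamp = when or datetime.now(timezone.utc)
    entry = {
        "user": user,
        "dataset": dataset,
        "action": action,
        "timestamp": timestamp.isoformat(),
    }
    access_log.append(entry)
    return entry


def accesses_by(user):
    """Return every logged entry for one user, in the order it was recorded."""
    return [entry for entry in access_log if entry["user"] == user]


record_access("alice", "customer_emails", "read")
record_access("bob", "customer_emails", "export")
print(accesses_by("bob"))
```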

    6. Behaviour alerts 

    While manually monitoring data is important, it’s also good to set up automatic alerts if there is unusual activity around your data centres. You should set up behaviour alerts and notifications in case threats or compromising events occur.
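One simple form of behaviour alert is a threshold on how often each user touches sensitive data in a given period. The sketch below assumes events arrive as `(user, dataset)` pairs and that 100 accesses per period is the limit; both are illustrative assumptions, not a recommended setting.

```python
from collections import Counter


def unusual_users(events, max_per_user=100):
    """Return users whose number of access events exceeds the threshold, sorted by name."""
    counts = Counter(user for user, _dataset in events)
    return sorted(user for user, total in counts.items() if total > max_per_user)


# Hypothetical events for one period: alice reads a little, bob reads a lot.
events = [("alice", "crm")] * 3 + [("bob", "crm")] * 150
print(unusual_users(events))
```

Real monitoring tools usually look at more than raw counts (time of day, location, type of data), but a threshold like this is often the first rule teams set up.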

    7. Onboarding, training, education

    One way to ensure quality data management is to keep your employees up to speed on data security. You should ensure data security is a part of your employee onboarding. Also, you should have regular training and education to keep people informed on protecting company and customer data.

    8. Create data protocols and processes 

    To ensure long-term data security, you should establish data protocols and processes. 

    To protect your user data, set up rules and systems within your organisation that people can reference and follow continuously to prevent data misuse.

    Leverage data ethically with Matomo

    Data is everything in business.

    But it’s not something to be taken lightly. Mishandling user data can break customer trust, lead to penalties from organisations and even create legal trouble and massive fines.

    You should only use privacy-first tools to ensure you’re handling data responsibly.

    Matomo is a privacy-friendly web analytics tool that collects, stores and tracks data across your website without breaking privacy laws.

    With over 1 million websites using Matomo, you can track and improve website performance with :

    • Accurate data (no data sampling)
    • Privacy-friendly and compliant with privacy regulations like GDPR, CCPA and more
    • Advanced features like heatmaps, session recordings, A/B testing and more

    Try Matomo free for 21 days. No credit card required.

  • Protecting consumer privacy: How to ensure CCPA compliance

    18 August 2023, by Erin, in CCPA, Privacy

    The California Consumer Privacy Act (CCPA) is a state law that enhances privacy rights and consumer protection for residents of California. 

    It grants consumers six rights, like the right to know what personal information is being collected about them by businesses and others. 

    CCPA also requires businesses to provide notice of data collection practices. Consumers can choose to opt out of the sale of their data. 

    In this article, we’ll learn more about the scope of CCPA, the penalties for non-compliance and how our web analytics tool, Matomo, can help you create a CCPA-compliant framework.

    What is the CCPA?

    CCPA was implemented on January 1, 2020. It ensures that businesses securely handle individuals’ personal information and respect their privacy in the digital ecosystem. 

    How does CCPA compliance add value?

    CCPA addresses the growing concerns over privacy and data protection; 40% of US consumers say they’re worried about digital privacy. With the increasing amount of personal information being collected and shared by businesses, there was a need to establish regulations to provide individuals with more control and transparency over their data. 

    CCPA aims to protect consumer privacy rights and promote greater accountability from businesses when handling personal information.

    Scope of CCPA 

    The scope of CCPA includes for-profit businesses that collect personal information from California residents, regardless of where you run the business from.

    It defines three thresholds that determine the inclusion criteria for businesses subject to CCPA regulations. 

    Businesses need to abide by CCPA if they meet any of these three thresholds:

    1. Revenue threshold: Have an annual gross revenue of over $25 million.
    2. Consumer threshold: Purchase, sell or distribute the personal information of 100,000 or more consumers, households or devices.
    3. Data threshold: Earn at least half of their annual revenue from selling the personal information of California residents.
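The three thresholds above can be sketched as a simple check. The function and parameter names are hypothetical, and the sketch only mirrors the thresholds as worded in this article; it is not legal advice on whether the CCPA applies to you.

```python
def ccpa_applies(annual_gross_revenue_usd, records_handled, revenue_share_from_selling_pi):
    """Return True if any one of the three thresholds described above is met.

    records_handled: consumers, households or devices whose personal information
        is purchased, sold or distributed.
    revenue_share_from_selling_pi: fraction (0.0 to 1.0) of annual revenue earned
        from selling California residents' personal information.
    """
    meets_revenue = annual_gross_revenue_usd > 25_000_000
    meets_consumer = records_handled >= 100_000
    meets_data = revenue_share_from_selling_pi >= 0.5
    return meets_revenue or meets_consumer or meets_data


# Hypothetical small business that handles a lot of consumer records.
print(ccpa_applies(2_000_000, 150_000, 0.0))
```

Because meeting any single threshold is enough, a small business by revenue can still be in scope.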

    What are the six consumer rights under the CCPA?

    Here’s a short description of the six consumer rights. 

    1. Right to know : Under this right, you can ask a business to disclose the specific personal information it collects about you and the categories of sources of that information. You can also ask about the purpose of collection and which third parties the business will disclose this info to. This allows consumers to understand what information is being held and how it is used. You can request this info for free twice a year.
    2. Right to delete : Consumers can request the deletion of their personal information. Companies must comply, with some exceptions.
    3. Right to opt-out : Consumers can deny the sale of their personal information. Companies must provide a link on their homepage for users to exercise this right. After you choose this, companies can’t sell your data unless you authorise them to do so later.
    4. Right to non-discrimination : Consumers cannot be discriminated against for exercising their CCPA rights. For instance, a company cannot charge different prices, provide a different quality of service or deny services.
    5. Right to correct : Consumers can request to correct inaccurate personal information.
    6. Right to limit use : Consumers can specify how they want the businesses to use their sensitive personal information. This includes social security numbers, financial account details, precise geolocation data or genetic data. Consumers can direct businesses to use this sensitive information only for specific purposes, such as providing the requested services.

    Penalties for CCPA non-compliance 

    52% of organisations have yet to adopt CCPA principles as of 2022. Non-compliance can attract penalties.

    Section 1798.155 of the CCPA states that any business that doesn’t comply with CCPA’s terms can face penalties based on the consumer’s private right of action. Consumers can take the company directly to civil court without needing a prosecutor to intervene.

    Businesses get 30 days to remedy a violation.

    If the violation isn’t remedied in that time, the business may receive a civil penalty of up to $2,500 per violation. Violations can be of any kind, even accidental. An intentional violation can attract a fine of up to $7,500.

    Consumers can also initiate private lawsuits to claim damages that range from $100 to $750, or actual damages (whichever is higher), for each occurrence of their unredacted and unencrypted data being breached on a business’s server.
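To get a feel for how these amounts add up, here is a minimal sketch of the arithmetic, using only the figures quoted above. The function names and example counts are hypothetical, and real outcomes depend on the court or regulator.

```python
def max_civil_penalty_usd(unintentional_violations, intentional_violations):
    """Upper bound: $2,500 per unintentional and $7,500 per intentional violation."""
    return unintentional_violations * 2_500 + intentional_violations * 7_500


def breach_damages_range_usd(occurrences, actual_damages_usd=0):
    """Range of damages in a private lawsuit: $100 to $750 per occurrence,
    or actual damages if those are higher."""
    low = max(occurrences * 100, actual_damages_usd)
    high = max(occurrences * 750, actual_damages_usd)
    return low, high


# Hypothetical case: 3 accidental and 2 intentional violations.
print(max_civil_penalty_usd(3, 2))
# Hypothetical breach affecting 10 occurrences with $5,000 in actual damages.
print(breach_damages_range_usd(10, 5_000))
```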

    CCPA vs. GDPR 

    Both CCPA and GDPR aim to enhance individuals’ control over their personal information and provide transparency about how their data is collected, used and shared. The comparison between the CCPA and GDPR is crucial in understanding the regulatory framework of data protection laws.

    Here’s how CCPA and GDPR differ:

    Scope

    • CCPA is for businesses that meet specific criteria and collect personal information from California residents. 
    • GDPR (General Data Protection Regulation) applies to businesses that process the personal data of citizens and residents of the European Union.

    Definition of personal information

    • CCPA includes personal information broadly, including identifiers such as IP addresses and households. Examples include name, email id, location and browsing history. However, it excludes HIPAA-protected medical data, clinical trial data and other personal information from government records.
    • GDPR covers any personal data relating to an identified or identifiable individual, excluding households. Examples include the phone number, email address and personal identification number. It excludes anonymous and deceased person’s data.

    Consent

    • Under the CCPA, consumers can opt out of the sale of their personal information.
    • GDPR states that organisations should obtain explicit consent from individuals for processing their personal data.

    Rights

    • CCPA grants the right to know what personal information is being collected and the right to request deletion of their personal information.
    • GDPR also gives individuals various rights, such as the right to access and rectify their personal data, the right to erasure (also known as the right to be forgotten) and also the right to data portability. 

    Enforcement

    • For CCPA, businesses may have to pay up to $2,500 for each violation, or up to $7,500 for each intentional violation. 
    • GDPR has stricter penalties for non-compliance, with fines of up to 4% of the global annual revenue of a company or €20 million, whichever is higher.

    A 5-step CCPA compliance framework 

    Here’s a simple framework you can follow to ensure compliance with CCPA. Alongside this, we’ll also share how Matomo can help. 

    Matomo is an open-source web analytics platform trusted by organisations like the United Nations, NASA and more. It provides valuable insights into website traffic, visitor behaviour and marketing effectiveness. More than 1 million websites and apps (approximately 1% of the internet !) use our solution, and it’s available in 50+ languages. Below, we’ll share how you can use Matomo to be CCPA compliant.

    1. Assess data

    First, familiarise yourself with the California Consumer Privacy Act and check your eligibility for CCPA compliance. 

    For example, as mentioned earlier, one threshold is whether your business purchases, receives or sells the personal data of 100,000 or more individuals or households.

    But how do you know if you have crossed 100K? With Matomo!

    Set the date range to last year, open Visitors, then Locations, and under the “Region” option, check the figure for California. If you’ve crossed 100K visitors, you know you have to become CCPA compliant.

    Identify and assess the personal information you collect with Matomo.

    2. Evaluate privacy practices

    Review the current state of your privacy policies and practices. Conduct a thorough assessment of data sharing and third-party agreements. Then, update policies and procedures to align with CCPA requirements.

    For example, you can anonymise IP addresses with Matomo to ensure that user data collected for web analytics purposes cannot be used to trace back to specific individuals.
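To illustrate the general idea behind IP anonymisation, here is a short sketch that zeroes out the trailing bytes of an IPv4 address using Python's standard `ipaddress` module. This is not Matomo's implementation; the function name and the default of masking two bytes are assumptions made for the example.

```python
import ipaddress


def anonymise_ipv4(address, masked_bytes=2):
    """Zero out the last `masked_bytes` bytes of an IPv4 address (illustrative only)."""
    if not 0 <= masked_bytes <= 4:
        raise ValueError("masked_bytes must be between 0 and 4")
    ip = ipaddress.ip_address(address)
    if ip.version != 4:
        raise ValueError("this sketch only handles IPv4 addresses")
    keep_bits = 32 - 8 * masked_bytes
    # strict=False lets us pass a host address and get its enclosing network back.
    network = ipaddress.ip_network(f"{address}/{keep_bits}", strict=False)
    return str(network.network_address)


print(anonymise_ipv4("192.168.100.27"))     # 192.168.0.0
print(anonymise_ipv4("192.168.100.27", 1))  # 192.168.100.0
```

The more bytes you mask, the harder it is to tie a visit back to one person, at the cost of less precise geolocation.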

    If you have a consent management solution to honour user requests for data privacy, you can also integrate Matomo with it. 

    3. Communicate 

    Inform consumers about their CCPA rights and how you handle their data.

    Establish procedures for handling consumer requests and obtaining consent. For example, you can add an opt-out form on your website with Matomo. Or you can also use Matomo to disable cookies from your website.

    Documenting your compliance efforts, including consumer requests and how you responded to them, is a good idea. Finally, educate staff on CCPA compliance and their responsibilities to work collaboratively.

    4. Review vendor contracts

    Assessing vendor contracts allows you to determine if they include necessary data processing agreements. You can also identify if vendors are sharing personal information with third parties, which could pose a compliance risk. Verify if vendors have adequate security measures in place to protect the personal data they handle.

    Where gaps exist, review and update agreements to include provisions for data protection, privacy and CCPA requirements.

    Establish procedures to monitor and review vendor compliance with CCPA regularly. This may include conducting audits, requesting certifications and implementing controls to mitigate risks associated with vendors handling personal data.

    5. Engage legal counsel

    Consider consulting with legal counsel to ensure complete understanding and compliance with CCPA regulations.

    Finally, stay updated on any changes or developments related to CCPA and adjust your compliance efforts accordingly.

    Matomo and CCPA compliance 

    There’s an increasing emphasis on privacy regulations like CCPA. Matomo offers a robust solution that allows businesses to be CCPA-compliant without sacrificing the ability to track and analyse crucial data.

    You can gain in-depth insights into user behaviour and website performance — all while prioritising data protection and privacy. 

    Request a demo or sign up for a free 21-day trial to get started with our powerful CCPA-compliant web analytics platform — no credit card required. 

    Disclaimer

    We are not lawyers and don’t claim to be. The information provided here is to help give an introduction to CCPA. We encourage every business and website to take data privacy seriously and discuss these issues with your lawyer if you have any concerns.

  • Developing MobyCAIRO

    26 May 2021, by Multimedia Mike, in General

    I recently published a tool called MobyCAIRO. The ‘CAIRO’ part stands for Computer-Assisted Image ROtation, while the ‘Moby’ prefix refers to its role in helping process artifact image scans to submit to the MobyGames database. The tool is meant to provide an accelerated workflow for rotating and cropping image scans. It works on both Windows and Linux. Hopefully, it can solve similar workflow problems for other people.

    As of this writing, MobyCAIRO has not been tested on Mac OS X yet. I expect some issues there that should be easily solvable if someone cares to test it.

    The rest of this post describes my motivations and how I arrived at the solution.

    Background
    I have scanned well in excess of 2100 images for MobyGames and other purposes in the past 16 years or so. The workflow looks like this:


    [Image: Workflow diagram. Caption: Image workflow]


    It should be noted that my original workflow featured me manually rotating the artifact on the scanner bed in order to ensure straightness, because I guess I thought that rotate functions in image editing programs constituted dark, unholy magic or something. So my workflow used to be even more arduous :


    [Image: Longer workflow diagram. Caption: I can’t believe I had the patience to do this for hundreds of scans]


    Sometime last year, I was sitting down to perform some more scanning and found myself dreading the oncoming tedium of straightening and cropping the images. This prompted a pivotal question:

    Why can’t a computer do this for me?

    After all, I have always been a huge proponent of making computers handle the most tedious, repetitive, mind-numbing, and error-prone tasks. So I did some web searching to find if there were any solutions that dealt with this. I also consulted with some like-minded folks who have to cope with the same tedious workflow.

    I came up empty-handed. So I endeavored to develop my own solution.

    Problem Statement and Prior Work

    I want to develop a workflow that can automatically rotate an image so that it is straight, and also find the most likely crop rectangle, uniformly whitening the area outside of the crop area (in the case of circles).

    As mentioned, I checked to see if any other programs can handle this, starting with my usual workhorse, Photoshop Elements. But I can’t expect the trimmed down version to do everything. I tried to find out if its big brother could handle the task, but couldn’t find a definitive answer on that. Nor could I find any other tools that seem to take an interest in optimizing this particular workflow.

    When I brought this up to some peers, I received some suggestions, including an idea that the venerable GIMP had a feature like this, but I could not find any evidence. Further, I would get responses of “Program XYZ can do image rotation and cropping.” I had to tamp down on the snark to avoid saying “Wow ! An image editor that can perform rotation AND cropping ? What a game-changer !” Rotation and cropping features are table stakes for any halfway competent image editor for the last 25 or so years at least. I am hoping to find or create a program which can lend a bit of programmatic assistance to the task.

    Why can’t other programs handle this? The answer seems fairly obvious: Image editing tools are general tools and I want a highly customized workflow. It’s not reasonable to expect a turnkey solution to do this.

    Brainstorming An Approach
    I started with the happiest of happy cases: a disc that needed archiving (a marketing/press assets CD-ROM from a video game company, contents described here) which appeared to have some pretty clear straight lines:

    [Image: Ubisoft 2004 Product Catalog CD-ROM]

    My idea was to try to find straight lines in the image and then rotate the image so that the longest single straight line detected is parallel to the horizontal.

    I just needed to figure out how to find a straight line inside of an image. Fortunately, I quickly learned that this is very much a solved problem thanks to something called the Hough transform. As a bonus, I read that this is also the tool I would want to use for finding circles, when I got to that part. The nice thing about knowing the formal algorithm to use is being able to find efficient, optimized libraries which already implement it.
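For readers who have not met the Hough transform, here is a minimal pure-Python sketch of the line-finding variant. It is not the code used in MobyCAIRO: it assumes the edge pixels are already available as `(x, y)` points, and the function names are made up for illustration. Each point votes for every line (angle `theta`, distance `rho`) that could pass through it, using the normal form `rho = x*cos(theta) + y*sin(theta)`; the bin with the most votes is the dominant line.

```python
import math
from collections import Counter


def hough_dominant_line(points, angle_step_deg=1, rho_step=1.0):
    """Return (theta_deg, rho, votes) for the line supported by the most points.

    theta_deg is the angle of the line's normal, so a horizontal line has theta_deg == 90.
    """
    votes = Counter()
    for x, y in points:
        for theta_deg in range(0, 180, angle_step_deg):
            theta = math.radians(theta_deg)
            rho = x * math.cos(theta) + y * math.sin(theta)
            # Quantise rho so that nearby lines fall into the same accumulator bin.
            votes[(theta_deg, round(rho / rho_step))] += 1
    (theta_deg, rho_bin), count = votes.most_common(1)[0]
    return theta_deg, rho_bin * rho_step, count


def skew_angle_deg(theta_deg):
    """Angle of the line itself relative to the horizontal (0 means already level)."""
    return theta_deg - 90


# A perfectly horizontal row of 100 edge points at y == 5.
theta, rho, count = hough_dominant_line([(x, 5) for x in range(100)])
print(theta, rho, count, skew_angle_deg(theta))
```

This brute-force version visits every point at every angle, which is why optimized library implementations are preferable on real scans.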

    Early Prototype
    A little searching for how to perform a Hough transform in Python led me first to scikit. I was able to rapidly produce a prototype that did some basic image processing. However, running the Hough transform directly on the image and rotating according to the longest line segment discovered turned out not to yield expected results.


    Sub-optimal rotation

    It also took a very long time to chew on the 3300×3300 raw image– certainly longer than I care to wait for an accelerated workflow concept. The key, however, is that you are apparently not supposed to run the Hough transform on a raw image– you need to compute the edges first, and then attempt to determine which edges are ‘straight’. The recommended algorithm for this step is the Canny edge detector. After applying this, I get the expected rotation :


    Perfect rotation

    The algorithm also completes in a few seconds. So this is a good early result and I was feeling pretty confident. But, again, this was the happiest of happy cases. I should also mention at this point that I had originally envisioned a tool that I would simply run against a scanned image and it would automatically/magically make the image straight, followed by a perfect crop.

    Along came my MobyGames comrade Foxhack to disabuse me of the hope of ever developing a fully automated tool. Just try and find a usefully long straight line in this:


    Nascar 07 Xbox Scan, incorrectly rotated

    Darn it, Foxhack…

    There are straight edges, to be sure. But my initial brainstorm of rotating according to the longest straight edge looks infeasible. Further, it was at this point that we started brainstorming that perhaps we could match on ratings badges such as the standard ESRB badges omnipresent on U.S. video games. This gets into feature detection and complicates things.

    This Needs To Be Interactive
    At this point in the effort, I came to terms with the fact that the solution would need to have some element of interactivity. I would also need to get out of my safe Linux haven and figure out how to develop this on a Windows desktop, something I am not experienced with.

    I initially dreamed up an impressive beast of a program written in C++ that leverages Windows desktop GUI frameworks, OpenGL for display and real-time rotation, GPU acceleration for image analysis and processing tricks, and some novel input concepts. I thought GPU acceleration would be crucial since I have a fairly good GPU on my main Windows desktop and I hear that these things are pretty good at image processing.

    I created a list of prototyping tasks on a Trello board and made a decent amount of headway on prototyping all the various pieces that I would need to tie together in order to make this a reality. But it was ultimately slow going when I could only grab an hour or two here and there to try to get anything done.

    Settling On A Solution
    Recently, I was determined to get a set of old shareware discs archived. I ripped the data a year ago but I was blocked on the scanning task because I knew that would also involve tedious straightening and cropping. So I finally got all the scans done, which was reasonably quick. But I was determined to not manually post-process them.

    This was fairly recent, but I can’t quite recall how I managed to come across the OpenCV library and its Python bindings. OpenCV is an amazing library that provides a significant toolbox for performing image processing tasks. Not only that, it provides “just enough” UI primitives to be able to quickly create a basic GUI for your program, including image display via multiple windows, buttons, and keyboard/mouse input. Furthermore, OpenCV seems to be plenty fast enough to do everything I need in real time, just with (accelerated where appropriate) CPU processing.

    So I went to work porting the ideas from the simple standalone Python/scikit tool. I thought of a refinement to the straight line detector: instead of just finding the longest straight edge, it creates a histogram of 360 rotation angles, and builds a list of lines corresponding to each angle. Then it sorts the angles by cumulative line length and allows the user to iterate through this list, which will hopefully provide the most likely straightened angle up front. Further, the tool allows making fine adjustments of 1/10 of a degree via the keyboard, not the mouse. It does all this while highlighting in red the straight line segments that are parallel to the horizontal axis, per the current candidate angle.


    MobyCAIRO - rotation interface
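
    The angle histogram described above might look something like the sketch below. It is an illustration under stated assumptions rather than MobyCAIRO's real code: the function name `rank_angles` is hypothetical, and mapping the 360 bins onto half-degree steps over 180° (since a line's orientation repeats every 180°) is a guess at one reasonable layout.

```python
import math
from collections import defaultdict


def rank_angles(segments, bins=360):
    """Group line segments by angle and rank angles by total segment length.

    segments: iterable of (x1, y1, x2, y2) tuples, e.g. from a Hough line
    detector. Returns a list of (angle_degrees, total_length, members),
    with the largest cumulative length first.
    """
    step = 180.0 / bins  # assumed layout: 360 bins of 0.5 degrees each
    buckets = defaultdict(list)
    for x1, y1, x2, y2 in segments:
        # Fold direction into [0, 180) so reversed segments share a bin.
        angle = math.degrees(math.atan2(y2 - y1, x2 - x1)) % 180.0
        index = int(round(angle / step)) % bins
        buckets[index].append((x1, y1, x2, y2))
    ranked = []
    for index, members in buckets.items():
        total = sum(math.hypot(x2 - x1, y2 - y1) for x1, y1, x2, y2 in members)
        ranked.append((index * step, total, members))
    # Most cumulative line length first: the likeliest straightening angle.
    ranked.sort(key=lambda item: item[1], reverse=True)
    return ranked
```

    An interactive front end could then step through the returned list one candidate at a time, using each entry's `members` to decide which segments to highlight.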

    The tool draws a light-colored grid over the frame to aid the user in visually verifying the straightness of the image. Further, the program has a mode that allows the user to see the algorithm’s detected edges:


    MobyCAIRO - show detected lines

    For the cropping phase, the program uses the Hough circle transform in a similar manner, finding the most likely circles (if the image to be processed is supposed to be a circle) and allowing the user to cycle among them while making precise adjustments via the keyboard, again, rather than the mouse.


    MobyCAIRO - assisted circle crop

    Running the Hough circle transform is a significantly more intensive operation than the line transform. When I ran it on a full 3300×3300 image, it ran for a long time. I didn’t let it run longer than a minute before forcibly ending the program. Is this approach unworkable? Not quite. It turns out that the transform is just as effective when the image is shrunk to 400×400, and it completes in under 2 seconds on my Core i5 CPU.

    For rectangular cropping, I just settled on using OpenCV’s built-in region-of-interest (ROI) facility. I tried to intelligently find the best candidate rectangle and allow fine adjustments via the keyboard, but I wasn’t having much success, so I took a path of lesser resistance.

    Packaging and Residual Weirdness
    I realized that this tool would be more useful to a broader Windows-using base of digital preservationists if they didn’t have to install Python, establish a virtual environment, and install the prerequisite dependencies. Thus, I made the effort to figure out how to wrap the entire thing up into a monolithic Windows EXE binary. It is available from the project’s GitHub release page (another thing I figured out for the sake of this project!).

    The binary is pretty heavy, weighing in at a bit over 50 megabytes. You might advise using compression, but it IS compressed! Before I figured out the --onefile option for pyinstaller.exe, the generated dist/ subdirectory was 150 MB. Among other things, there’s a 30 MB FORTRAN BLAS library packaged in!

    Conclusion and Future Directions
    Once I got it all working with a simple tkinter UI up front in order to select between circle and rectangle crop modes, I unleashed the tool on 60 or so scans in bulk, using the Windows forfiles command (another learning experience). I didn’t put a clock on the effort, but it felt faster. Of course, I was livid with proudness the whole time because I was using my own tool. I just wish I had thought of it sooner. But, really, with 2100+ scans under my belt, I’m just getting started: I literally have thousands more artifacts to scan for preservation.

    The tool isn’t perfect, of course. Just tonight, I threw another scan at MobyCAIRO. Just go ahead and try to find straight lines in this specimen:


    Reading Who? Reading You! CD-ROM

    I eventually had to use the text left and right of center to line up against the grid with the manual keyboard adjustments. Still, I’m impressed by how these computer vision algorithms can see patterns I can’t, highlighting lines I never would have guessed at.

    I’m eager to play with OpenCV some more, particularly the video processing functions, perhaps even some GPU-accelerated versions.

    The post Developing MobyCAIRO first appeared on Breaking Eggs And Making Omelettes.